SURROGATE ORPHAN LIGANDS FOR ORPHAN RECEPTORS
BACKGROUND OF THE INVENTION 

Field of the Invention 

This invention pertains to the field of obtaining surrogate ligands that are 
functional upon orphan receptors. 

Background 

Rapid genomic DNA sequencing often uncovers new receptors, termed 
"orphan receptors," for which the cognate natural ligand(s) are unknown. Newly discovered 
orphan receptors are often assignable to a family of existing receptors for which one or more 
ligands may have already been identified and cloned, and the receptor-ligand interactions 
studied. Existing members of the ligand family, however, often show little or no binding or 
biological activity towards a new putative member of the receptor family. The elucidation of 
the biological function of an orphan receptor must generally await the identification and 
characterization of the natural cognate ligand for the orphan receptor. Similarly, upon the 
discovery of a previously unknown ligand, the elucidation of its biological function must 
await identification of its cognate receptor. 

Previously available approaches for identifying cognate ligands for an orphan 
receptor, and cognate receptors for an orphan ligand, suffer from serious drawbacks. The 
approach of rapidly cloning as many possible members of a ligand or receptor family by 
homology, for example, is likely to prove slow and tortuous due to the often large number of 
ligands or receptors in a ligand or receptor family. Thus, a need exists for improved methods 
by which one can obtain a ligand that binds to and exerts a biological activity upon an 
orphan receptor. The present invention fulfills this and other needs.

SUMMARY OF THE INVENTION 

In a first embodiment, the present invention provides methods for obtaining a
surrogate ligand for an orphan receptor. The methods involve: (1) creating a library of
recombinant polynucleotides; and (2) screening the library to identify a recombinant
polynucleotide that encodes a surrogate ligand that can specifically bind to a ligand binding
domain of the orphan receptor and/or modulate the activity of the orphan receptor.

In presently preferred embodiments, a library of recombinant polypeptides is
obtained by recombining at least first and second forms of a nucleic acid, each of which
forms encodes a ligand for a member of a receptor family, or a fragment of said ligand,
wherein the first and second forms differ from each other in two or more nucleotides, to
produce a library of recombinant nucleic acids. The receptor family is chosen based upon
homology to the orphan receptor of interest. The library of recombinant nucleic acids is then
screened to identify a recombinant polynucleotide that encodes a surrogate ligand that can
specifically bind to a ligand binding domain of the orphan receptor and/or modulate the
activity of the orphan receptor.

In some embodiments, these methods further involve: (3) recombining at least
one recombinant polynucleotide that encodes a surrogate ligand identified in the first round
of screening with a further form of the nucleic acid, which is the same or different from the
first and second forms, to produce a further library of recombinant polynucleotides; and (4)
screening the further library to identify at least one further optimized recombinant
polynucleotide that encodes a surrogate ligand that can specifically bind to a ligand binding
domain of the orphan receptor and/or modulate the activity of the receptor. The recombining
and screening steps are repeated, as necessary, until the surrogate ligand encoded by the
further optimized recombinant polynucleotide exhibits an enhanced ability to specifically
bind to the ligand binding domain of the orphan receptor.
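
The recombine-and-screen cycle of steps (1) through (4) can be sketched as a toy simulation.
This is only an illustration under stated assumptions, not the laboratory procedure: the
function names (recombine, screen, evolve), the equal-length string "genes", the random
crossover model and the caller-supplied score function are all hypothetical stand-ins for
DNA shuffling and a binding or activity assay.

```python
import random


def recombine(parents, n_children, crossovers=2, rng=None):
    """Build chimeras by switching between parents at random crossover points.

    Toy model (assumption): all parents are strings of equal length.
    """
    rng = rng or random.Random(0)
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("toy model requires equal-length parents")
    library = []
    for _ in range(n_children):
        points = sorted(rng.sample(range(1, length), crossovers))
        pieces, start = [], 0
        for end in points + [length]:
            # Each segment is copied from a randomly chosen parent.
            pieces.append(rng.choice(parents)[start:end])
            start = end
        library.append("".join(pieces))
    return library


def screen(library, score, keep):
    """Return the `keep` best-scoring distinct library members."""
    unique = sorted(set(library))  # sorted first so ties break deterministically
    return sorted(unique, key=score, reverse=True)[:keep]


def evolve(parents, score, rounds=3, library_size=200, keep=5, seed=0):
    """Repeat recombination and screening, feeding hits into the next round."""
    rng = random.Random(seed)
    pool = list(parents)
    for _ in range(rounds):
        library = recombine(pool, library_size, rng=rng)
        hits = screen(library, score, keep)
        # Hits are recombined with further forms (here, the original parents).
        pool = hits + list(parents)
    return screen(pool, score, 1)[0]
```

In this sketch the score function plays the role of the screen for binding to the ligand
binding domain; any callable that ranks candidates can be substituted.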

In other embodiments, the screening methods involve expressing the library
of recombinant polynucleotides, and contacting the resulting library of candidate surrogate
ligands with a test cell that contains a polypeptide which comprises: a) a ligand binding
domain of the orphan receptor (which can be an extracellular domain of the receptor); and b)
a cytoplasmic and/or DNA binding domain of a second receptor, whereby the binding of a
ligand to the ligand binding domain of the peptide results in a detectable effect on the test
cells. The surrogate ligand typically exhibits an agonist function upon binding to the ligand
binding domain of the orphan receptor, although in some cases an antagonist effect is
observed. The second receptor is, in some embodiments, a cytokine receptor such as, for
example, an interleukin receptor, an interferon receptor, a chemokine receptor, a
hematopoietic growth factor receptor, a tumor necrosis factor receptor, and a transforming
growth factor receptor. The DNA binding domain can also be obtained from the orphan ligand
itself (i.e., the entire orphan ligand is used in the screening assay).

The invention also provides methods of identifying a surrogate ligand by
expressing a library of recombinant polynucleotides to obtain a library of candidate surrogate
ligands, and screening the candidate surrogate ligands using a reporter gene system. For
example, the candidate surrogate ligands can be contacted with a test cell that includes:

a) a fusion polypeptide comprising: 1) a ligand binding domain of the orphan
receptor; and 2) a DNA binding domain of a second receptor; and

b) a reporter gene construct which comprises a response element to which the
DNA binding domain can bind, wherein the response element is operably linked to a
promoter that is operative in the cell, and the promoter is operably linked to a reporter gene.

The screening involves determining whether the reporter gene is expressed at a higher or
lower level in the presence of a candidate surrogate ligand compared to expression in the
absence of the candidate surrogate ligand. In these embodiments, the DNA binding domain
can be, for example, a GAL4 DNA binding domain, or can be obtained from a receptor such
as, for example, an estrogen receptor, a progesterone receptor, a glucocorticoid receptor, an
androgen receptor, a mineralocorticoid receptor, a vitamin D receptor, a retinoid receptor, and
a thyroid hormone receptor, or can be from the orphan receptor itself if a response element
for the orphan receptor is known.
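
The comparison of reporter expression with and without a candidate can be sketched as a
simple fold-change call. This is a minimal illustration, assuming replicate reporter
readings (for example, luminescence counts) are available as lists of numbers; the function
name classify_candidate and the two-fold threshold are assumptions chosen for the example
and are not specified by the text.

```python
from statistics import mean


def classify_candidate(with_ligand, without_ligand, threshold=2.0):
    """Compare mean reporter signal with and without a candidate ligand.

    Returns (fold_change, call), where call is "higher", "lower" or
    "unchanged". The threshold is an illustrative assumption.
    """
    baseline = mean(without_ligand)
    if baseline <= 0:
        raise ValueError("baseline reporter signal must be positive")
    fold = mean(with_ligand) / baseline
    if fold >= threshold:
        return fold, "higher"   # consistent with an agonist-like effect
    if fold <= 1.0 / threshold:
        return fold, "lower"    # consistent with an antagonist-like effect
    return fold, "unchanged"
```

In practice the threshold would be set from the variability of the assay rather than fixed
in advance.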

BRIEF DESCRIPTION OF THE FIGURES

Figures 1A and 1B show shuffled human interferons. Figure 1A shows the amino acid
sequences of seven evolved IFN-αs and the eight native Hu-IFN-αs from which they are
derived. The most parsimonious genealogies of the shuffled IFN-αs are shown schematically.
Recombination junctions are shown at the midpoint between two amino acids derived from
different parental genes. The gene segments are colored according to which parental gene
they are derived from (Hu-IFN-α1, red; Hu-IFN-α5, green; Hu-IFN-α8, yellow; Hu-IFN-α16,
purple; Hu-IFN-α17, orange; Hu-IFN-αF, blue; Hu-IFN-αH, gray). Amino acids that arose by
point mutation during DNA shuffling are circled.

Figure 1B shows the amino acid sequence of one of the cycle two chimeras,
IFN-α-CH2.2, which is aligned with the most potent human and mouse IFN-αs, Hu-IFN-α1
and Mu-IFN-α4. The IFN-α residues that putatively contact the IFN-α receptor (Fish, E. N.
(1992) J. Interferon Res. 12(4):257-66; Uze et al. (1994) J. Mol. Biol. 243(2): 245-57) are
boxed. Residues in Hu-IFN-α1 that have been shown by site directed mutagenesis to
contribute to activity on mouse cells (Horisberger, M. A., and Di Marco, S. (1995)
Pharmacol. Ther. 66(3): 507-34; Weber et al. (1987) EMBO J. 6(3):591-8; Fish, supra;
Uze et al., supra) are shaded.

Figure 2 shows the antiviral activities of native IFN-αs and an evolved IFN-α.
The results from the antiviral assay on murine L929 cells of Hu-IFN-α2a, Hu-IFN-α1,
Mu-IFN-α4 and IFN-α-CH2.1 are shown. The dashed lines indicate the IFN-α dose
corresponding to half-maximal protection (one unit/ml). The assays were done in triplicate
and the standard errors (% of the estimated units; Table 1) are: Mu-IFN-α4, 24%;
Hu-IFN-α1, 6%; Hu-IFN-α2a, 17%; IFN-α-CH2.1, 15%.

Figure 3 shows a summary of the antiviral activities of native and evolved
IFN-αs on murine L929 cells. The antiviral activities of purified CHO protein for native
Mu-IFN-αs, native Hu-IFN-αs and evolved IFN-αs on murine L929 cells are shown. One unit of
activity corresponds to half-maximal protection from a lethal ECMV viral challenge. The
arrows on the right indicate the fold improvement of IFN-α-CH2.3 relative to Hu-IFN-α1
and Hu-IFN-α2a. The activities of the proteins were measured in four independent
experiments, and the rank order of the clones is the same in all four assays, with the
exception of assay #3, in which Mu-IFN-α4 exceeded the activity of IFN-α-CH1.1, but not
the round two evolved IFN-αs.

Figure 4 provides a structural model of the alpha carbon backbone
of IFN-α-CH2.2, based on the NMR structure of Hu-IFN-α2a (Scarozza et al. (1992) J.
Interferon Res. 12: 35-42). The protein backbone is colored to indicate the native Hu-IFN-α
segment from which it is derived (Residues 29-39, 121-140 Hu-IFN-α1, red; Residues 46-120
Hu-IFN-α5, green; Residues 40-45 Hu-IFN-α8, yellow; Residues 1-28 Hu-IFN-αF,
blue; Residues 141-166 Hu-IFN-αH, gray). The side chains of putative murine IFN-α
receptor contacting residues K121 and R125 are shown.

DETAILED DESCRIPTION 

Definitions 

The term "cytokine" includes, for example,
chemokines, hematopoietic growth factors, tumor necrosis factors and transforming growth
factors. In general these are small molecular weight proteins that regulate maturation,
activation, proliferation and differentiation of the cells of the immune system.

A "surrogate ligand" is a polypeptide that can bind to a receptor for which the
surrogate ligand is not a naturally occurring cognate ligand, and thus typically mediate a
biological effect. In some instances, the receptor to which the surrogate ligand binds is an
orphan receptor for which no cognate ligand is known; in other instances, the receptor has
one or more known cognate ligands but the surrogate ligand has a differential binding
and/or biological mediating effect compared to a naturally occurring cognate ligand.
Conversely, a "surrogate receptor" is a polypeptide that can act as a receptor for a ligand for
which the polypeptide is not a naturally occurring cognate receptor. Again, the ligand can be
an orphan ligand for which no cognate receptors are known, or can be a ligand for
which one or more cognate receptors are known but which exhibits a differential binding
and/or biological mediating effect compared to a naturally occurring cognate receptor.

An "orphan receptor" is a putative receptor polypeptide for which a naturally
occurring cognate ligand is not known at the time of the development of a surrogate ligand.
Similarly, an "orphan ligand" is a putative ligand polypeptide that is believed to exhibit
binding affinity for a receptor, and thus mediation of a biological effect, where the receptor
is not known at the time a surrogate receptor is obtained using the methods of the invention.

An orphan receptor or an orphan ligand is said to exhibit homology to a
known receptor or ligand, respectively, when the orphan receptor or ligand has one or more
features that distinguish the known receptor or ligand from receptors or ligands of other
families. For example, the orphan receptor can have a high degree of amino acid sequence
similarity to the known receptor over all or part of the polypeptide. Generally, when an orphan
receptor is classified on the basis of amino acid sequence similarity, the orphan receptor will
be at least about 60% identical to the amino acid sequence of a corresponding domain of at
least one member of a known receptor family. More preferably, the orphan receptor will be
at least about 70% identical, still more preferably at least about 80% identical, and even
more preferably at least about 90% identical to the corresponding domain of the known
receptor. Another way to identify whether an orphan receptor exhibits homology to a known
receptor (or an orphan ligand exhibits homology to a known ligand) is by determining
whether the orphan receptor or ligand shares a primary sequence motif with members of a
family of known receptors or ligands. Motifs of different receptor families are well known to
those of skill in the art (e.g., C-X-C, C-C for chemokines). Yet another indication that an
orphan receptor might belong to a particular receptor family is that the structure of the
orphan receptor shares features with the known receptors. For example, an Ig fold, an MHC
fold, and the like, can provide information as to which family of receptors an orphan
receptor is likely to be a member.
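
The identity cutoffs above can be illustrated with a short calculation on two sequences
that have already been aligned. This is a sketch under stated assumptions: the function
names are hypothetical, gaps are written as "-", and identity is counted over columns in
which neither sequence has a gap, which is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Cutoffs taken from the paragraph above, highest first.
TIERS = [(90.0, "at least about 90%"),
         (80.0, "at least about 80%"),
         (70.0, "at least about 70%"),
         (60.0, "at least about 60%")]


def homology_tier(identity):
    """Map a percent identity onto the tiers named in the text."""
    for cutoff, label in TIERS:
        if identity >= cutoff:
            return label
    return "below about 60%"
```

For example, two ten-residue domains that differ at one position score 90% and fall in the
highest tier.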

The term "screening" describes, in general, a process that identifies
polypeptides that function as surrogate ligands or surrogate receptors. Several properties of
the respective molecules can be used in selection and screening including, for example,
ability to bind to a ligand binding domain of the orphan receptor. The binding is preferably
accompanied by modulation of an activity (e.g., enhanced or reduced expression of a
reporter gene that is responsive to a DNA binding domain or intracellular domain of a
second receptor to which the orphan receptor ligand binding domain is attached). Selection is
a form of screening in which identification and physical separation are achieved
simultaneously by expression of a selection marker, which, in some genetic circumstances,
allows cells expressing the marker to survive while other cells die (or vice versa). Screening
markers include, for example, luciferase, beta-galactosidase and green fluorescent protein.
Selection markers include drug and toxin resistance genes, and the like. Although
spontaneous selection can and does occur in the course of natural evolution, in the present
methods selection is performed by man.

An "exogenous DNA segment", "heterologous sequence" or a "heterologous
nucleic acid", as used herein, is one that originates from a source foreign to a particular host
cell, or, if from the same source, is modified from its original form. Thus, a heterologous
gene in a host cell includes a gene that is endogenous to the particular host cell, but has been
modified. Modification of a heterologous nucleic acid in the applications described herein
typically occurs through the use of DNA shuffling. Thus, the terms refer to a DNA segment
which is foreign or heterologous to the cell, or homologous to the cell but in a position
within the host cell genome at which the element is not ordinarily found. Exogenous DNA
segments are expressed to yield exogenous polypeptides (i.e., polypeptides that are not
native to the host cell, or are native to the host cell but are in modified form compared to the
natural form of the polypeptide).

The term "isolated", when applied to a nucleic acid or protein, denotes that
the nucleic acid or protein is essentially free of other cellular components with which it is
associated in the natural state. In particular, an "isolated gene" or an "isolated nucleic acid"
is separated from open reading frames which flank the gene in its natural chromosomal
location and encode a protein other than the gene of interest. An "isolated" polypeptide or
nucleic acid is preferably in a homogeneous state although it can be in either a dry or
aqueous solution. Purity and homogeneity are typically determined using analytical
chemistry techniques such as polyacrylamide gel electrophoresis or high performance liquid
chromatography. A protein or nucleic acid which is the predominant species present in a
preparation is said to be "substantially purified." The term "purified" denotes that a nucleic
acid or protein gives rise to essentially one band in an electrophoretic gel. Particularly, it
means that the nucleic acid or protein is at least about 50% pure, more preferably at least
about 85% pure, and most preferably at least about 99% pure.

The term "naturally-occurring" is used to describe an object that can be found
in nature as distinct from being artificially produced by man. For example, a polypeptide or
polynucleotide sequence that is present in an organism (including viruses, bacteria, protozoa,
insects, plants or mammalian tissue) that can be isolated from a source in nature and which
has not been intentionally modified by man in the laboratory is naturally-occurring.

The term "nucleic acid" refers to deoxyribonucleotides or ribonucleotides and
polymers thereof in either single- or double-stranded form. Unless specifically limited, the
term encompasses nucleic acids containing known analogues of natural nucleotides which
have similar binding properties as the reference nucleic acid and are metabolized in a manner
similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic
acid sequence also implicitly encompasses conservatively modified variants thereof (e.g.,
degenerate codon substitutions) and complementary sequences, as well as the sequence
explicitly indicated. Specifically, degenerate codon substitutions may be achieved by
generating sequences in which the third position of one or more selected (or all) codons is
substituted with mixed-base and/or deoxyinosine residues (Batzer et al. (1991) Nucleic Acid
Res. 19: 5081; Ohtsuka et al. (1985) J. Biol. Chem. 260: 2605-2608; Cassol et al. (1992);
Rossolini et al. (1994) Mol. Cell. Probes 8: 91-98).
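
A third-position substitution of this kind can be sketched as follows. The example is
deliberately narrower than the text: it assumes a DNA coding sequence in frame, uses the
standard genetic code, and replaces the third base with the mixed-base symbol N only for
codons where all four third-position bases encode the same amino acid. The function name
and the choice of N (rather than deoxyinosine) are assumptions made for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for each codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def degenerate_third_positions(seq, mixed="N"):
    """Replace the third base of fourfold-degenerate codons with `mixed`.

    Codons whose meaning depends on the third base (e.g. ATG) are left
    unchanged, so the encoded polypeptide is preserved.
    """
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    out = []
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        encoded = {CODON_TABLE[codon[:2] + base] for base in BASES}
        out.append(codon[:2] + mixed if len(encoded) == 1 else codon)
    return "".join(out)
```

Applied to GCT ATG GGA (Ala-Met-Gly), the alanine and glycine codons become GCN and GGN
while the methionine codon is left as ATG.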

The term nucleic acid is used interchangeably with gene, cDNA, and mRNA.
Accordingly, the term "gene" is used broadly to refer to any segment of DNA associated
with a biological function. Thus, genes include coding sequences and/or the regulatory
sequences required for their expression. Genes also include nonexpressed DNA segments
that, for example, form recognition sequences for other proteins. Genes can be obtained from
a variety of sources, including cloning from a source of interest or synthesizing from known
or predicted sequence information, and may include sequences designed to have desired
parameters.

"Nucleic acid derived from a gene" refers to a nucleic acid for whose
synthesis the gene, or a subsequence thereof, has ultimately served as a template. Thus, an
mRNA, a cDNA reverse transcribed from an mRNA, an RNA transcribed from a gene or
cDNA, a DNA amplified from the gene or cDNA, an RNA transcribed from the amplified
DNA, etc., are all derived from the gene and detection of such derived products is indicative
of the presence and/or abundance of the original gene and/or gene transcript in a sample.

A nucleic acid is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For instance, a promoter or enhancer is
operably linked to a coding sequence if it increases the transcription of the coding sequence.
Operably linked means that the DNA sequences being linked are typically contiguous and,
where necessary to join two protein coding regions, contiguous and in reading frame.
However, since enhancers generally function when separated from the promoter by several
kilobases and intronic sequences may be of variable lengths, some polynucleotide elements
may be operably linked but not contiguous.

The term "recombinant" when used with reference to a cell indicates that the
cell replicates a heterologous nucleic acid, or expresses a peptide or protein encoded by a
heterologous nucleic acid. Recombinant cells can contain genes that are not found within
the native (non-recombinant) form of the cell. Recombinant cells can also contain genes
found in the native form of the cell wherein the genes are modified and re-introduced into
the cell by artificial means. The term also encompasses cells that contain a nucleic acid
endogenous to the cell that has been modified without removing the nucleic acid from the
cell; such modifications include those obtained by gene replacement, site-specific mutation,
and related techniques.

A "recombinant expression cassette" or simply an "expression cassette" is a
nucleic acid construct, generated recombinantly or synthetically, with nucleic acid elements
that are capable of effecting expression of a structural gene in hosts compatible with such
sequences. Expression cassettes include at least promoters and, optionally, transcription
termination signals. Typically, the recombinant expression cassette includes a nucleic acid
to be transcribed (e.g., a member of a library of recombinant polynucleotides), and a
promoter. Additional factors necessary or helpful in effecting expression may also be used
as described herein. For example, an expression cassette can also include nucleotide
sequences that encode a signal sequence that directs secretion of an expressed protein from
the host cell. Transcription termination signals, enhancers, and other nucleic acid sequences
that influence gene expression can also be included in an expression cassette.

A "recombinant polynucleotide" or a "recombinant polypeptide" is a
non-naturally occurring polynucleotide or polypeptide that includes nucleic acid or amino acid
sequences, respectively, from more than one source nucleic acid or polypeptide, which
source nucleic acid or polypeptide can be a naturally occurring nucleic acid or polypeptide,
or can itself have been subjected to mutagenesis or other type of modification. The source
polynucleotides or polypeptides from which the different nucleic acid or amino acid
sequences are derived are sometimes homologous (i.e., have, or encode a polypeptide that
encodes, the same or a similar structure and/or function), and are often from different
isolates, serotypes, strains, species of organism, or from different disease states, for example.
A recombinant ligand, for example, will have amino acids from more than one naturally
occurring ligand.

The terms "identical" or percent "identity," in the context of two or more
nucleic acid or polypeptide sequences, refer to two or more sequences or subsequences that
are the same or have a specified percentage of amino acid residues or nucleotides that are the
same, when compared and aligned for maximum correspondence, as measured using one of
the following sequence comparison algorithms or by visual inspection.

The phrase "substantially identical," in the context of two nucleic acids or
polypeptides, refers to two or more sequences or subsequences that have at least 60%,
preferably 80%, most preferably 90-95% nucleotide or amino acid residue identity, when
compared and aligned for maximum correspondence, as measured using one of the following
sequence comparison algorithms or by visual inspection. Preferably, the substantial identity
exists over a region of the sequences that is at least about 50 residues in length, more
preferably over a region of at least about 100 residues, and most preferably the sequences are
substantially identical over at least about 150 residues. In some embodiments, the sequences
are substantially identical over a particular domain (e.g., an extracellular or intracellular
domain, or a DNA binding domain or ligand binding domain), or are substantially identical
over the entire length of the coding regions.

For sequence comparison, typically one sequence acts as a reference sequence,
to which test sequences are compared. When using a sequence comparison algorithm, test
and reference sequences are input into a computer, subsequence coordinates are designated,
if necessary, and sequence algorithm program parameters are designated. The sequence
comparison algorithm then calculates the percent sequence identity for the test sequence(s)
relative to the reference sequence, based on the designated program parameters.

Optimal alignment of sequences for comparison can be conducted, e.g., by
the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the
homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the
search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444
(1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA,
and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575
Science Dr., Madison, WI), or by visual inspection (see generally Ausubel et al., infra).
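
As an illustration of the first of these approaches, the following is a compact sketch of
Smith-Waterman local alignment with a linear gap penalty. It is not the implementation in
any of the named software packages; the scoring values (match, mismatch, gap) and the
function name are assumptions chosen for the example, and real comparisons of polypeptides
would normally use a substitution matrix instead.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]      # dynamic-programming matrix
    best, best_pos = 0, (0, 0)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, rows):
        for j in range(1, cols):
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub(i, j),   # align two residues
                          H[i - 1][j] + gap,             # gap in b
                          H[i][j - 1] + gap)             # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + sub(i, j):
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))
```

Because every cell is floored at zero, the alignment covers only the best-matching region
of the two sequences rather than their full lengths.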

One example of an algorithm that is suitable for determining percent
sequence identity and sequence similarity is the BLAST algorithm, which is described in
Altschul et al., J. Mol. Biol. 215:403-410 (1990). Software for performing BLAST analyses
is publicly available through the National Center for Biotechnology Information
(http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring
sequence pairs (HSPs) by identifying short words of length W in the query sequence, which
either match or satisfy some positive-valued threshold score T when aligned with a word of
the same length in a database sequence. T is referred to as the neighborhood word score
threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for
initiating searches to find longer HSPs containing them. The word hits are then extended in
both directions along each sequence for as far as the cumulative alignment score can be
increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters
M (reward score for a pair of matching residues; always > 0) and N (penalty score for
mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to
calculate the cumulative score. Extension of the word hits in each direction is halted when:
the cumulative alignment score falls off by the quantity X from its maximum achieved
value; the cumulative score goes to zero or below, due to the accumulation of one or more
negative-scoring residue alignments; or the end of either sequence is reached. The BLAST
algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The
BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an
expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For
amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an
expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989)
Proc. Natl. Acad. Sci. USA 89:10915).
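
The seed-and-extend procedure described above can be sketched for nucleotide sequences as
follows. This is a simplified illustration and not the BLAST program itself: seeding uses
exact word matches only (no neighborhood threshold T), extension is ungapped, and the
function names, the short word length used in the usage note and the value of X are
assumptions. The reward and penalty defaults follow the M=5 and N=-4 values quoted in the
text.

```python
def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) for every exact shared word of length w."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [(i, j)
            for i in range(len(query) - w + 1)
            for j in index.get(query[i:i + w], [])]


def _extend_one_way(query, subject, qi, sj, step, score, reward, penalty, x_drop):
    """Walk from (qi, sj) in direction `step`; return (best_score, best_length)."""
    best, best_len, n = score, 0, 0
    qi += step
    sj += step
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += reward if query[qi] == subject[sj] else penalty
        n += 1
        if score > best:
            best, best_len = score, n
        # Halt when the score drops X below its maximum or reaches zero.
        if best - score >= x_drop or score <= 0:
            break
        qi += step
        sj += step
    return best, best_len


def extend(query, subject, qi, sj, w, reward=5, penalty=-4, x_drop=20):
    """Extend a seed both ways; return (score, query_start, subject_start, length)."""
    seed_score = w * reward
    right_best, right_len = _extend_one_way(
        query, subject, qi + w - 1, sj + w - 1, +1,
        seed_score, reward, penalty, x_drop)
    total, left_len = _extend_one_way(
        query, subject, qi, sj, -1,
        right_best, reward, penalty, x_drop)
    return total, qi - left_len, sj - left_len, left_len + w + right_len
```

With the toy sequences TTACGTACGTAA and GGACGTACGTCC and a word length of 4, the seed at
position (2, 2) extends to the eight-base segment ACGTACGT.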

In addition to calculating percent sequence identity, the BLAST algorithm
also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin
& Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90:5873-5787). One measure of similarity
provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an
indication of the probability by which a match between two nucleotide or amino acid
sequences would occur by chance. For example, a nucleic acid is considered similar to a
reference sequence if the smallest sum probability in a comparison of the test nucleic acid to
the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and
most preferably less than about 0.001.

Another indication that two nucleic acid sequences are substantially identical
is that the two molecules hybridize to each other under stringent conditions. The phrase
"hybridizing specifically to" refers to the binding, duplexing, or hybridizing of a molecule
only to a particular nucleotide sequence under stringent conditions when that sequence is
present in a complex mixture (e.g., total cellular) DNA or RNA. "Bind(s) substantially"
refers to complementary hybridization between a probe nucleic acid and a target nucleic acid
and embraces minor mismatches that can be accommodated by reducing the stringency of
the hybridization media to achieve the desired detection of the target polynucleotide
sequence.

"Stringent hybridization conditions" and "stringent hybridization wash
conditions" in the context of nucleic acid hybridization experiments such as Southern and
northern hybridizations are sequence dependent, and are different under different
environmental parameters. Longer sequences hybridize specifically at higher temperatures.
An extensive guide to the hybridization of nucleic acids is found in Tijssen (1993)
Laboratory Techniques in Biochemistry and Molecular Biology - Hybridization with Nucleic
Acid Probes part I chapter 2 "Overview of principles of hybridization and the strategy of
nucleic acid probe assays", Elsevier, New York. Generally, highly stringent hybridization
and wash conditions are selected to be about 5°C lower than the thermal melting point (Tm)
for the specific sequence at a defined ionic strength and pH. Typically, under "stringent
conditions" a probe will hybridize to its target subsequence, but to no other sequences.

The Tm is the temperature (under defined ionic strength and pH) at which
50% of the target sequence hybridizes to a perfectly matched probe. Very stringent
conditions are selected to be equal to the Tm for a particular probe. An example of stringent
hybridization conditions for hybridization of complementary nucleic acids which have more
than 100 complementary residues on a filter in a Southern or northern blot is 50%
formamide with 1 mg of heparin at 42°C, with the hybridization being carried out overnight.
An example of highly stringent wash conditions is 0.15M NaCl at 72°C for about 15
minutes. An example of stringent wash conditions is a 0.2x SSC wash at 65°C for 15
minutes (see Sambrook, infra, for a description of SSC buffer). Often, a high stringency
wash is preceded by a low stringency wash to remove background probe signal. An example
medium stringency wash for a duplex of, e.g., more than 100 nucleotides, is 1x SSC at 45°C
for 15 minutes. An example low stringency wash for a duplex of, e.g., more than 100
nucleotides, is 4-6x SSC at 40°C for 15 minutes. For short probes (e.g., about 10 to 50
nucleotides), stringent conditions typically involve salt concentrations of less than about 1.0
M Na+ ion, typically about 0.01 to 1.0 M Na+ ion concentration (or other salts) at pH 7.0 to
8.3, and the temperature is typically at least about 30°C. Stringent conditions can also be
achieved with the addition of destabilizing agents such as formamide. In general, a signal to
noise ratio of 2x (or higher) than that observed for an unrelated probe in the particular
hybridization assay indicates detection of a specific hybridization. Nucleic acids which do
not hybridize to each other under stringent conditions are still substantially identical if the
polypeptides which they encode are substantially identical. This occurs, e.g., when a copy of
a nucleic acid is created using the maximum codon degeneracy permitted by the genetic
code.

A further indication that two nucleic acid sequences or polypeptides are 
substantially identical is that the polypeptide encoded by the first nucleic acid is 
immunologically cross reactive with, or specifically binds to, the polypeptide encoded by the 
second nucleic acid. Thus, a polypeptide is typically substantially identical to a second 
polypeptide, for example, where the two peptides differ only by conservative substitutions. 

A "specific binding affinity" between two molecules, for example, a ligand 
and a receptor, means a preferential binding of one molecule for another in a mixture of 
molecules. The binding of the molecules can be considered specific if the binding affinity is 
about 1 x 10^4 M^-1 to about 1 x 10^6 M^-1 or greater.
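
The affinity criterion above can be illustrated with a minimal sketch. It assumes the affinity is expressed as an association constant Ka (in M^-1), obtained as the reciprocal of a measured dissociation constant Kd (in M). The function names and the choice of the lower bound of the stated range (1 x 10^4 M^-1) as the default cutoff are illustrative assumptions, not part of the specification.

```python
def association_constant(kd_molar: float) -> float:
    """Convert a dissociation constant Kd (in M) to an association constant Ka (in M^-1)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be a positive concentration")
    return 1.0 / kd_molar


def is_specific_binding(ka_per_molar: float, threshold: float = 1e4) -> bool:
    """Return True if Ka meets the cutoff.

    The default cutoff of 1e4 M^-1 is an assumption taken from the lower
    end of the range quoted in the text; a stricter cutoff such as 1e6
    can be passed explicitly.
    """
    return ka_per_molar >= threshold


# Hypothetical measurements: a micromolar binder and a millimolar binder.
micromolar_ka = association_constant(1e-6)   # roughly 1e6 M^-1
millimolar_ka = association_constant(1e-3)   # roughly 1e3 M^-1
```

With these hypothetical values, the micromolar binder meets the default cutoff and the millimolar binder does not.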

The phrase "specifically (or selectively) binds to" or "specifically (or
selectively) immunoreactive with", when referring to a protein or peptide (e.g., a ligand),
refers to a binding reaction which is determinative of the presence of the protein, or an
epitope from the protein, in the presence of a heterogeneous population of proteins and other
biologics. Thus, under designated assay conditions, the specified ligands bind to a particular
receptor (e.g., an orphan receptor or an antibody) and do not bind in a significant amount to
other proteins present in the sample. Antibodies raised against a multivalent antigenic
polypeptide will generally bind to the proteins from which one or more of the epitopes were
obtained. Specific binding to an antibody under such conditions may require an antibody that
is selected for its specificity for a particular protein. A variety of immunoassay formats may
be used to select antibodies specifically immunoreactive with a particular protein. For
example, solid-phase ELISA immunoassays, Western blots, or immunohistochemistry are
routinely used to select monoclonal antibodies specifically immunoreactive with a protein.
See Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor
Publications, New York ("Harlow and Lane"), for a description of immunoassay formats and
conditions that can be used to determine specific immunoreactivity. Typically a specific or
selective reaction will be at least twice background signal or noise and more typically more
than 10 to 100 times background.

"Conservatively modified variations" of a particular polynucleotide sequence
refers to those polynucleotides that encode identical or essentially identical amino acid
sequences, or where the polynucleotide does not encode an amino acid sequence, to
essentially identical sequences. Because of the degeneracy of the genetic code, a large
number of functionally identical nucleic acids encode any given polypeptide. For instance,
the codons CGU, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine.
Thus, at every position where an arginine is specified by a codon, the codon can be altered to
any of the corresponding codons described without altering the encoded polypeptide. Such
nucleic acid variations are "silent variations," which are one species of "conservatively
modified variations." Every polynucleotide sequence described herein which encodes a
polypeptide also describes every possible silent variation, except where otherwise noted.
One of skill will recognize that each codon in a nucleic acid (except AUG, which is
ordinarily the only codon for methionine) can be modified to yield a functionally identical
molecule by standard techniques. Accordingly, each "silent variation" of a nucleic acid
which encodes a polypeptide is implicit in each described sequence.
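
As an illustration of silent variation, the following minimal sketch enumerates every coding sequence that encodes the same polypeptide as a given RNA sequence. It assumes the standard genetic code and an input whose length is a multiple of three; the function names are hypothetical and chosen only for this example.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases U, C, A, G
# (first base varies slowest, third base fastest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def codons(rna: str):
    """Split an RNA coding sequence into consecutive codons."""
    return [rna[i:i + 3] for i in range(0, len(rna), 3)]


def translate(rna: str) -> str:
    """Translate an RNA coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TO_AA[c] for c in codons(rna))


def count_silent_variants(rna: str) -> int:
    """Count the coding sequences (including the input) encoding the same polypeptide."""
    return prod(len(SYNONYMS[CODON_TO_AA[c]]) for c in codons(rna))


def silent_variants(rna: str):
    """Yield every coding sequence that encodes the same polypeptide as the input."""
    choices = [SYNONYMS[CODON_TO_AA[c]] for c in codons(rna)]
    for combination in product(*choices):
        yield "".join(combination)
```

For the two-codon sequence AUGCGU (methionine followed by arginine), the methionine codon has no synonym and the arginine codon has six, so the sketch yields six sequences, all of which translate to the same dipeptide.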
Furthermore, one of skill will recognize that individual substitutions,
deletions or additions which alter, add or delete a single amino acid or a small percentage of
amino acids (typically less than 5%, more typically less than 1%) in an encoded sequence are
"conservatively modified variations" where the alterations result in the substitution of an
amino acid with a chemically similar amino acid. Conservative substitution tables providing
functionally similar amino acids are well known in the art. See, e.g., Creighton (1984)
Proteins, W.H. Freeman and Company, for additional groupings of amino acids. In addition,
individual substitutions, deletions or additions which alter, add or delete a single amino acid
or a small percentage of amino acids in an encoded sequence are also "conservatively
modified variations".
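
The substitution criterion above can be sketched as a simple check on two aligned polypeptides of equal length. The residue groupings used here are an assumed, commonly cited set of chemically similar amino acids and are not taken from this specification; the 5% limit is the "typically less than 5%" figure from the text. Deletions and additions are not handled in this sketch, and the function names are hypothetical.

```python
# Assumed groupings of chemically similar residues (illustrative only).
CONSERVATIVE_GROUPS = [
    set("AST"),   # alanine, serine, threonine
    set("DE"),    # aspartic acid, glutamic acid
    set("NQ"),    # asparagine, glutamine
    set("RK"),    # arginine, lysine
    set("ILMV"),  # isoleucine, leucine, methionine, valine
    set("FYW"),   # phenylalanine, tyrosine, tryptophan
]


def is_conservative(a: str, b: str) -> bool:
    """True if residues a and b are identical or fall in the same assumed group."""
    return a == b or any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def is_conservatively_modified(reference: str, variant: str,
                               max_fraction: float = 0.05) -> bool:
    """Check a substitution-only variant against the criteria in the text.

    The variant qualifies if every changed residue is a conservative
    substitution and either at most one residue changed or the changed
    fraction is below max_fraction.
    """
    if len(reference) != len(variant):
        raise ValueError("this sketch compares equal-length sequences only")
    changes = [(a, b) for a, b in zip(reference, variant) if a != b]
    if not all(is_conservative(a, b) for a, b in changes):
        return False
    return len(changes) <= 1 or len(changes) / len(reference) < max_fraction
```

A single leucine-to-isoleucine change in a ten-residue peptide qualifies under this sketch, whereas an aspartate-to-lysine change does not.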

A "subsequence" refers to a sequence of nucleic acids or amino acids that
comprise a part of a longer sequence of nucleic acids or amino acids (e.g., polypeptide),
respectively.

Description of the Preferred Embodiments

The present invention provides methods for obtaining ligands for receptors, in
particular receptors for which cognate ligands are not yet known. The methods are also
useful for obtaining recombinant ligands that exhibit greater or reduced binding affinity for,
and/or biological activation of, a known receptor, compared to the naturally occurring
cognate ligand for the receptor. Conversely, the methods are also useful for obtaining a
receptor for a ligand for which a cognate receptor is not yet known, or for obtaining a receptor
that has greater or reduced binding affinity for, and/or biological activation by, a known
ligand.

The methods of the invention provide significant advantages over previously
available methods of identifying ligands for newly discovered receptors, or receptors for
newly discovered ligands. Unlike previously available methods, the surrogate ligands or
surrogate receptors can be obtained relatively quickly, using a relatively small number of
assays. The methods are scalable and generic, so they can rapidly and economically be
applied to any receptor family of interest to obtain variants that have novel properties.
Moreover, little or no structural information regarding the interaction between ligand and
receptor is necessary in order to obtain the surrogate ligand.

The methods of the invention for obtaining a surrogate ligand for an orphan
receptor involve creating a library of recombinant polynucleotides, which library is then
screened to identify a recombinant polynucleotide that encodes a surrogate ligand that can
specifically bind to a ligand binding domain of the orphan receptor. The creation of
recombinant libraries, as well as screening methods, are described below.

A. Creation of Recombinant Libraries

The invention involves creating recombinant libraries of polynucleotides that
are then screened to identify those library members that exhibit a desired property, e.g.,
ability to act as a surrogate ligand for an orphan receptor, or as a surrogate receptor for an
orphan ligand. The recombinant libraries can be created using any of various methods, as
described below.

Methods for obtaining recombinant polynucleotides and/or for obtaining
diversity in nucleic acids used as the substrates for DNA shuffling as described herein
include, for example, homologous recombination (PCT/US98/05223; Publ. No.
WO98/42727); oligonucleotide-directed mutagenesis (for review see, Smith, Ann. Rev.
Genet. 19: 423-462 (1985); Botstein and Shortle, Science 229: 1193-1201 (1985); Carter,
Biochem. J. 237: 1-7 (1986); Kunkel, "The efficiency of oligonucleotide directed
mutagenesis" in Nucleic acids & Molecular Biology, Eckstein and Lilley, eds., Springer
Verlag, Berlin (1987)). Included among these methods are oligonucleotide-directed
mutagenesis (Zoller and Smith, Nucl. Acids Res. 10: 6487-6500 (1982), Methods in Enzymol.
100: 468-500 (1983), and Methods in Enzymol. 154: 329-350 (1987)); phosphothioate-modified
DNA mutagenesis (Taylor et al., Nucl. Acids Res. 13: 8749-8764 (1985); Taylor et
al., Nucl. Acids Res. 13: 8765-8787 (1985); Nakamaye and Eckstein, Nucl. Acids Res. 14:
9679-9698 (1986); Sayers et al., Nucl. Acids Res. 16: 791-802 (1988); Sayers et al., Nucl.
Acids Res. 16: 803-814 (1988)); mutagenesis using uracil-containing templates (Kunkel,
Proc. Nat'l. Acad. Sci. USA 82: 488-492 (1985), and Kunkel et al., Methods in Enzymol. 154:
367-382); mutagenesis using gapped duplex DNA (Kramer et al., Nucl. Acids Res. 12:
9441-9456 (1984); Kramer and Fritz, Methods in Enzymol. 154: 350-367 (1987); Kramer et
al., Nucl. Acids Res. 16: 7207 (1988); and Fritz et al., Nucl. Acids Res. 16: 6987-6999
(1988)). Additional suitable methods include point mismatch repair (Kramer et al., Cell 38:
879-887 (1984)), mutagenesis using repair-deficient host strains (Carter et al., Nucl. Acids
Res. 13: 4431-4443 (1985); Carter, Methods in Enzymol. 154: 382-403 (1987)), deletion
mutagenesis (Eghtedarzadeh and Henikoff, Nucl. Acids Res. 14: 5115 (1986)), restriction-
selection and restriction-purification (Wells et al., Phil. Trans. R. Soc. Lond. A 317: 415-423
(1986)), and mutagenesis by total gene synthesis (Nambiar et al., Science 223: 1299-1301
(1984); Sakamar and Khorana, Nucl. Acids Res. 14: 6361-6372 (1988); Wells et al., Gene
34: 315-323 (1985); and Grundström et al., Nucl. Acids Res. 13: 3305-3316 (1985)). Kits for
mutagenesis are commercially available (e.g., Bio-Rad, Amersham International, Anglian
Biotechnology).

In a presently preferred embodiment, the recombinant libraries are prepared
using DNA shuffling. The shuffling and screening or selection can be used to "evolve"
individual genes, whole plasmids or viruses, multigene clusters, or even whole genomes
(Stemmer (1995) Bio/Technology 13:549-553). Reiterative cycles of recombination and
screening/selection can be performed to further evolve the nucleic acids of interest. Such
techniques do not require the extensive analysis and computation required by conventional
methods for polypeptide engineering. Shuffling allows the recombination of large numbers
of mutations in a minimum number of selection cycles, in contrast to traditional, pairwise
recombination events. Thus, the sequence recombination techniques described herein
provide particular advantages in that they provide recombination between mutations in any
or all of these, thereby providing a very fast way of exploring the manner in which different
combinations of mutations can affect a desired result. In some instances, however, structural
and/or functional information is available which, although not required for sequence
recombination, provides opportunities for modification of the technique.

Exemplary formats and examples for sequence recombination, sometimes
referred to as DNA shuffling, evolution, or molecular breeding, have been described by the
present inventors and co-workers in co-pending applications U.S. Patent Application Serial
No. 08/198,431, filed February 17, 1994, Serial No. PCT/US95/02126, filed February 17,
1995, Serial No. 08/425,684, filed April 18, 1995, Serial No. 08/537,874, filed October 30,
1995, Serial No. 08/564,955, filed November 30, 1995, Serial No. 08/621,859, filed March
25, 1996, Serial No. 08/621,430, filed March 25, 1996, Serial No. PCT/US96/05480, filed
April 18, 1996, Serial No. 08/650,400, filed May 20, 1996, Serial No. 08/675,502, filed July
3, 1996, Serial No. 08/721,824, filed September 27, 1996, Serial No. PCT/US97/17300,
filed September 26, 1997, and Serial No. PCT/US97/24239, filed December 17, 1997;
Stemmer, Science 270:1510 (1995); Stemmer et al., Gene 164:49-53 (1995); Stemmer,
Bio/Technology 13:549-553 (1995); Stemmer, Proc. Natl. Acad. Sci. U.S.A. 91:10747-10751
(1994); Stemmer, Nature 370:389-391 (1994); Crameri et al., Nature Medicine 2(1):1-3
(1996); Crameri et al., Nature Biotechnology 14:315-319 (1996), each of which is
incorporated by reference in its entirety for all purposes.

The methods require at least two variant forms of a starting substrate, such as
a nucleic acid that encodes a receptor, or a part of a receptor if a surrogate ligand is desired.
The variant forms of candidate substrates can show substantial sequence or secondary
structural similarity with each other, but they should also differ in at least two positions. The
initial diversity between forms can be the result of natural variation, e.g., the different variant
forms (homologs) are obtained from different individuals or strains of an organism
(including geographic variants) or constitute related sequences from the same organism (e.g.,
allelic variations). Alternatively, the initial diversity can be induced, e.g., the second variant
form can be generated by error-prone transcription, such as an error-prone PCR or use of a
polymerase which lacks proof-reading activity (see Liao (1990) Gene 88:107-111), of the
first variant form, or by replication of the first form in a mutator strain. The initial diversity
between substrates is greatly augmented in subsequent steps of recursive sequence
recombination.

Sequence recombination can be achieved in many different formats and
permutations of formats, which share some common principles. Recursive sequence
recombination entails successive cycles of recombination to generate molecular diversity.
That is, one creates a family of nucleic acid molecules showing some sequence identity to
each other but differing in the presence of mutations. In any given cycle, recombination can
occur in vivo or in vitro, intracellular or extracellular. Furthermore, diversity resulting from
recombination can be augmented in any cycle by applying prior methods of mutagenesis
(e.g., error-prone PCR or cassette mutagenesis) to either the substrates or products for
recombination. In some instances, a new or improved property or characteristic can be
achieved after only a single cycle of in vivo or in vitro recombination, as when using
different, variant forms of the sequence, such as homologs from different individuals or
strains of an organism, or related sequences from the same organism, such as allelic variations.

Often, improvements are achieved after one round of recombination and
selection. However, recursive sequence recombination can be employed to achieve still
further improvements in a desired property, such as binding affinity for an orphan receptor
and/or modulation of receptor activity.
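
The cycle of recombination followed by screening can be illustrated with a purely computational toy model. This is a sketch under stated assumptions, not the laboratory procedure: sequences are equal-length strings, recombination is modeled as switching between parental templates at random crossover points, and the screen is an arbitrary scoring function supplied by the caller. All names and parameter values are hypothetical, and no mutagenesis step is modeled.

```python
import random


def recombine(parents, n_crossovers, rng):
    """Build one chimera by switching template parent at random crossover points."""
    length = len(parents[0])
    points = sorted(rng.sample(range(1, length), n_crossovers))
    segments, start = [], 0
    for end in points + [length]:
        template = rng.choice(parents)       # pick a parent for this segment
        segments.append(template[start:end])
        start = end
    return "".join(segments)


def recursive_recombination(parents, score, cycles=3, library_size=200,
                            keep=10, n_crossovers=3, seed=0):
    """Run successive cycles of recombination and screening.

    Each cycle builds a library of chimeras from the current pool, then
    keeps the best-scoring members (the current pool is retained as well,
    so the best score never decreases from one cycle to the next).
    """
    rng = random.Random(seed)
    pool = list(parents)
    for _ in range(cycles):
        library = {recombine(pool, n_crossovers, rng) for _ in range(library_size)}
        candidates = library | set(pool)
        # Sort on (score, sequence) so that ties are broken deterministically.
        pool = sorted(candidates, key=lambda s: (score(s), s), reverse=True)[:keep]
    return pool


# Toy screen: number of positions matching a hypothetical target sequence.
TARGET = "AAAAACCCCC"
PARENTS = ["AAAAAAAAAA", "CCCCCCCCCC"]


def toy_score(seq):
    return sum(a == b for a, b in zip(seq, TARGET))


best = recursive_recombination(PARENTS, toy_score)[0]
```

In this toy model every position of every recombinant is inherited from one of the starting parents, which mirrors the point made above that recombination explores combinations of existing differences rather than creating new ones.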

In a presently preferred embodiment, "family shuffling" is used to create the
library of recombinant polynucleotides. In family shuffling, nucleic acids that encode
homologous polypeptides from different strains, species, or gene families are used as the
different forms of the nucleic acids. The nucleic acids can encode, for example, human and
mouse homologs of a particular ligand (e.g., the same ligand), or different human homologs
of a ligand (e.g., ligands for different receptors within a receptor family). Or the different
forms of the nucleic acid can encode different ligands within a family, as well as homologs
from different species. As genomics provides an increasing amount of sequence information,
it is increasingly possible to directly amplify homologs with designed primers. For example,
given the sequence of interferon-α genes from several species, one can design primers for
amplification of the homologs. The resulting fragments can then be subjected to shuffling.

The substrate nucleic acids that are used to create the recombinant library of
polynucleotides are chosen depending upon the particular application. For example, where a
surrogate ligand is desired for an orphan receptor that is believed to be a member of a
cytokine receptor family, polynucleotides that encode all or part of a cognate ligand for
receptors of that cytokine receptor family are subjected to recombination. For example,
where the orphan receptor appears to be a member of the cytokine/hematopoietic growth
factor (Type I) cytokine receptor family, the starting polynucleotides can encode all or part
of an IL-2, IL-4, or IL-6 polypeptide. Similarly, for an orphan receptor that appears to be a
member of the interferon (Type II) receptor family, nucleic acids that encode one or more of
interferon-α, interferon-β, or interferon-τ can be used as a starting substrate. For an orphan
receptor of the TNF (Type III) receptor family, the starting substrates can be, for example,
polynucleotides that encode tumor necrosis factor. Surrogate ligands for the Ig superfamily
of cytokine receptors can be obtained by using IL-1-encoding polynucleotides to make the
recombinant library, while obtaining surrogate ligands for an orphan receptor of the seven
transmembrane helix family can involve making a recombinant library using IL-8-encoding
polynucleotides as the starting material.

The methods can also be used to obtain a surrogate ligand, or an improved
ligand, for a member of a receptor family such as androgen receptors, estrogen receptors,
glucocorticoid receptors, mineralocorticoid receptors, progesterone receptors, retinoic acid
receptors, thyroid hormone receptors, and the like. As discussed above, polynucleotides
that encode one or more cognate ligands for receptors in the particular family of interest are
used to create a library of recombinant polynucleotides, which is then screened to identify
those recombinant polynucleotides that encode a ligand that has specific affinity for the
orphan receptor of interest.

Representative, but not limiting, examples of gene families of interest, and
representative ligands that can be shuffled to obtain surrogate ligands for orphan receptors,
are listed in Table 1.

1. Chemokines

In some embodiments, the invention provides methods of obtaining surrogate
ligands for orphan receptors that exhibit homology to one or more types of chemokine
receptor. These methods involve identifying a known chemokine receptor that exhibits
homology (e.g., amino acid sequence similarity, conserved amino acid residues, structural
similarity, and the like) to the orphan receptor. Nucleic acids that encode all or part of one or
more known ligands for this known receptor are then subjected to DNA shuffling. For
example, if the orphan receptor exhibits homology to a Cysteine-Cysteine (C-C) chemokine
receptor (e.g., CCR-1, -2, -3, -4, -5, -6, -7, -8; see Table 2 for examples of gene names), the
shuffled ligand-encoding nucleic acids can be selected from those listed in Table 3. A
shuffling reaction can involve two or more homologs of the same gene from different
mammals (e.g., human SCYA1 shuffled with mouse SCYA1), two or more different genes
from a single mammalian species (e.g., human SCYA1 shuffled with human SCYA2), or
any combination thereof.

Table 2: C-C Chemokine Receptors

Gene Symbol             Gene Name
(Human)     (Mouse)
CCR1        Cmkbr1      CKR1, CMKBR1, MIP-1a/RANTES-R, HM145, LD78-R
(CCR1L1)    Cmkbr1l1    CMKBR1L1, MIP-1a-R-like 1
CCR2        Cmkbr2      CKR2, CMKBR2, CCR2A, CCR2B, MCP1-R, JE/FIC-R
CCR3        Cmkbr1l2    CKR3, CMKBR3, eotaxin-R, CMKBR1L2, MIP-1a-R-like 2
CCR4        Cmkbr4      CKR4, CMKBR4, K5-5, MIPR17
CCR5        Cmkbr5      CKR5, CMKBR5, ChemR13, MIP-1a-R 2
CCR6                    CMKBR6, STRL22, GPCR29, CKR-L3, GPR-CY4, DRY6,
                        KY411, LARC-R
CCR7        Cmkbr7      CKR7, CMKBR7, EBI1, BLR2, MIP-3b-R
CCR8        Cmkbr8      CKR-8, CMKBR8, TER1, ChemR1, CKR-L1, GPR-CY6,
                        CMKBRL2
CCR9        Cmkbr10     GPR-9-6
GPR2                    GPR2, SEL226b
Table 3: C-C Chemokines

Gene Symbol                     Gene Name
(Human)         (Mouse)
SCYA1           Scya1           CCL1, I-309, TCA3, P500, SISe
SCYA2           Scya2           CCL2, MCP-1, MCAF, JE, SMC-CF, GDCF-2
SCYA3,          Scya3           CCL3, CCL3L1, LD78a, LD78b, AT464.1, AT464.2, G0S19-1,
SCYA3L1                         G0S19-2, MIP-1a, SCI, TY-5, L2G25B, MIP-1aS, MIP-1aP, SISa,
                                SISb
SCYA3L2                         LD78g, G0S19-3 (pseudogene)
SCYA4,          Scya4           CCL4, CCL4L, AT744.1, AT744.2, Act-2, G-26, HC21, H400,
SCYA4L                          MIP-1b, LAG-1
SCYA5           Scya5           CCL5, RANTES, SISd
SCYA6           Scya6           C10, MRP-1
SCYA7           Scya7           CCL7, MCP-3, NC28, FIC, MARC
SCYA8           Scya8           CCL8, MCP-2, HC14
(SCYA9,         Scya9,          MRP-2, CCF18, MIP-1g
SCYA10)         Scya10
SCYA11          Scya11          CCL11, eotaxin
(SCYA12)        Scya12          CCL12, MCP-5
SCYA13          -               CCL13, MCP-4, NCC-1, CKb10
SCYA14          -               CCL14, HCC-1, HCC-3, NCC-2, CKb1, MCIF
SCYA15                          CCL15, HCC-2, NCC-3, MIP-5, Lkn-1, MIP-1d
SCYA16          Scya16-ps       CCL16, NCC-4, LEC, HCC-4, LMC, LCC-1, CKb12
SCYA17          Scya17          CCL17, TARC, ABCD-2
SCYA18                          CCL18, DC-CK1, PARC, MIP-4, AMAC-1, CKb7
SCYA19                          CCL19, ELC, MIP-3b, exodus-3, CKb11
SCYA20          Scya20          CCL20, MIP-3a, LARC, exodus-1, ST38, CKb4
SCYA21          Scya21a, b      CCL21, SLC, 6Ckine, exodus-2, TCA4, 6Ckine-ser (Scya21a),
                                6Ckine-leu (Scya21b), CKb9
SCYA22          Scya22          CCL22, MDC, STCP-1, ABCD-1, DC/B-CK
SCYA23          -               CCL23, MIP-3, MPIF-1, CKb8, CKb8-1
SCYA24                          CCL24, MPIF-2, CKb6, eotaxin-2
SCYA25          Scya25          CCL25, TECK, Ckb15
SCYA26                          CCL26, SCYA26, eotaxin-3, IMAC
SCYA27          Scya27          CCL27, ALP, skinkine, ILC, ESkine, PESKY, CTAK
(clone 391)                     clone 391
(Carp CC-1)                     carp CC Chemokine-1

Table from the Cytokine Family Database
(http://cytokine.medic.kumamoto-u.ac.jp/CFC/CK/CCG/CCG.html)



To obtain a surrogate ligand for an orphan receptor that exhibits homology to 
a Cysteine-X-Cysteine (C-X-C) chemokine receptor (e.g., CXCR-1, -2, -3, or -4, and others 
listed in Table 4), the nucleic acids that are subjected to shuffling can include one or more of 
those listed in Table 5. 

Table 4: C-X-C Chemokine Receptors

Gene Symbol             Gene Name
(Human)     (Mouse)
IL8RA                   CXCR1, CMKAR1, IL8RA, IL8R1, CDW128
IL8RB       Cmkar2      CXCR2, CMKAR2, IL8RB, IL8R2
IL8RBP                  IL8RP (pseudogene)
GPR9        Cmkar3
CXCR4       Cmkar4      CMKAR4, LCR1, NPY3R, fusin, HM89, LESTR, NPYRL, SDF-1R
BLR1        Blr1        CXCR5, BLR1, MDR15

Table from the Cytokine Family Database
(http://crf.medic.kumamoto-u.ac.jp/CRF/CXCR/CXCR.html)



Table 5: C-X-C Chemokines

Gene Symbol                     Gene Name
(Human)         (Mouse)
SCYB1           Gro1            CXCL1, GRO1, GROa, MGSA-a
SCYB2           Scyb2           CXCL2, GRO2, GROb, MIP-2a, MGSA-b
SCYB3                           CXCL3, GRO3, GROg, MIP-2b
SCYB4,          -               CXCL4, CXCL4V1, PF4, PF4var1, PF4alt
SCYB4V1
(PF4-like)      -               PF4-like
SCYB5           Scyb5           CXCL5, ENA-78, LDC, AMCF-II
SCYB6                           CXCL6, GCP-2, CKA-3
(GCP-2 like)                    GCP-2 like
SCYB7                           CXCL7, PPBP, PPBPL1, PBP, b-TG1, b-TG2, TGB1, TGB2,
                                CTAPIII, CTAP3, NAP-2, NAP-2-L1, LA-PF4, MDGF, LDGF
(PBP-like)                      DNA binding protein, SPBPBP
SCYB8                           CXCL8, IL-8, MDNCF, NAP-1, 3-10C, MONAP, LUCT, AMCF-I,
                                LYNAP, NAF, b-ENAP
SCYB9           Mig             CXCL9, mig, Humig
SCYB10          Ifi10           CXCL10, IP-10, crg-2, mob-1, C7, gIP-10
SCYB11                          CXCL11, H174, b-R1, I-TAC, IP-9
(SCYB9B)
SCYB12          Sdf1            CXCL12, SDF-1a, SDF-1b, PBSF, TLSF-a, TLSF-b, TPAR1
SCYB13                          CXCL13, BLC, BCA-1, BLR1L, Angie
SCYB14          Scyb14          CXCL14, BRAK, NJAC
SCYB15          Scyb15          CXCL15, lungkine, CINC-2b-like, weche
(MGSA-pseudo)                   MGSA pseudogene
(NAP-4)                         NAP-4

Table from the Cytokine Family Database
(http://cytokine.medic.kumamoto-u.ac.jp/CFC/CK/CXCG/CXCG.html)

Surrogate ligands for orphan receptors that exhibit homology to the CXXXC
family of chemokine receptors (e.g., CX3CR1) can be obtained by shuffling different forms
of nucleic acids that encode SCYD-1 (e.g., homologs of SCYD-1 from different mammalian
species). Similarly, surrogate ligands for C chemokine-like receptors (e.g., CCXCR1 (gene
names include Ccxcr1, XCR1, GPR5, SCM1-R)) can be obtained by shuffling nucleic acids
that encode known C chemokines, such as those listed in Table 6.



Table 6: C Chemokines

Gene Symbol             Gene Name
(Human)     (Mouse)
SCYC1       Lptn        CL1, Lymphotactin, SCM-1a, ATAC
SCYC2       -           CL2, SCM-1b

Table from the Cytokine Family Database
(http://cytokine.medic.kumamoto-u.ac.jp/CFC/CK/CG/CG.html)






PCT/US00/05764 



Chemokines that are encoded by viruses are also of interest for use in 
obtaining surrogate ligands for orphan receptors. For example, one can shuffle two or more 
viral chemokine-encoding nucleic acids listed in Table 7. 

Table 7: Viral Chemokine cDNAs and Corresponding GenBank Accession Numbers

Marek's disease virus (Gallid herpesvirus 1)
    M89471      Eco Q protein
    U34965      Eco Q protein
    U34966      Eco Q protein
    U55025      MKT-1 unidentified

Stealth virus (unclassified)
    AF145588    clone 3B516
    U27769      clone 3B654 M13RP
    U27885      clone 3B33 T7
    U27908      clone 3B624 T7
    U27928      clone 3B657 T7

Kaposi's sarcoma-associated herpes virus-HHV8
    U50138      vMIP-1a
    U71366      similar to MIP-1a
    U74585      vMIP-IA
    U75698      vMIP-I
    U93872      K6

Kaposi's sarcoma-associated herpes virus-HHV8
    AF091347    1609-1325
    U67775      vMIP-1B
    U71365      similar to MIP-1a
    U75698      vMIP-II
    U93872      K4

Kaposi's sarcoma-associated herpes virus-HHV8
    AF091347    972-628
    U75698      22185-22529
    U83351      BCK
    U93872      K4.1

Molluscum contagiosum virus subtype 1
    U60315      MC148R
    U86945      H-M-N-3

Molluscum contagiosum virus subtype 2
    U96749      MC148R2

Murine cytomegalovirus 1
    AF124602    CC chemokine homolog
    L32187      2747-2942
    U10326      MCK-1 (ORF HJ1)
    U68299      188029-188376

Human herpesvirus-6 variant A strain U1102
    U13194      EDRF3
    X83413      U83 (EDRF3)

Human herpesvirus-6 variant B strains CB11R Z29 and HST
    AB021506    U83
    AF157706    herpesvirus 6B
    U92288      H83

Table from the Cytokine Family Database
(http://cytokine.medic.kumamoto-u.ac.jp/CFC/CK/VIRUS/VIRUS.html)

2. FGF Family 

To obtain surrogate ligands for orphan receptors that exhibit homology to the
fibroblast growth factor (FGF) receptor family, the invention involves shuffling two or more
forms of an FGF-encoding nucleic acid. Again, one can use homologs of a single FGF
species that are obtained from different mammals, or two or more types of FGF species from
a single mammalian species, or a combination thereof. Genes that encode members of the
FGF/HBGF family are listed in Table 8.






Table 8: FGF/HBGF Family

Gene Symbol             Gene Name
(Human)     (Mouse)
FGF1        Fgf1        fibroblast growth factor 1 (acidic), acidic FGF, heparin-binding
                        growth factor-1 (HBGF-1), FGFA, beta-endothelial cell growth
                        factor (ECGF-beta)
FGF2        Fgf2        fibroblast growth factor 2 (basic), basic FGF, heparin binding
                        growth factor-2 (HBGF-2), bFGF
FGF3        Fgf3        fibroblast growth factor 3, int-2, (murine mammary tumor virus
                        integration site (v-int-2) oncogene homolog)
FGF4        Fgf4        fibroblast growth factor 4, transforming gene from human stomach-1,
                        hst, hst-1, heparin-binding secretory transforming factor-1
                        (HSTF1), Kaposi's sarcoma FGF (ksFGF), K-FGF, KS3
FGF5                    fibroblast growth factor 5, oncogene encoding fibroblast growth
                        factor-related protein
FGF6        Fgf6        fibroblast growth factor 6, fibroblast growth factor-related gene,
                        hst-2
FGF7        Fgf7        fibroblast growth factor 7, keratinocyte growth factor (KGF)
FGF8        Fgf8        fibroblast growth factor 8, androgen-induced growth factor (AIGF)
FGF9        Fgf9        fibroblast growth factor 9, glia-activating factor (GAF), FGF-9
FGF10       Fgf10       fibroblast growth factor 10, keratinocyte growth factor 2, KGF-2
FGF11       Fgf11       fibroblast growth factor 11, fibroblast growth factor homologous
                        factor 3 (FHF-3)
FGF12       Fgf12       fibroblast growth factor 12, fibroblast growth factor homologous
                        factor 1 (FHF-1)
FGF13       Fgf13       fibroblast growth factor 13, fibroblast growth factor homologous
                        factor 2 (FHF-2)
FGF14       Fgf14       fibroblast growth factor 14, fibroblast growth factor homologous
                        factor 4 (FHF-4)
(FGF15)     Fgf15       fibroblast growth factor 15
FGF16       -           fibroblast growth factor 16
FGF17       Fgf17       fibroblast growth factor 17
FGF18       Fgf18       fibroblast growth factor 18
FGF19       Fgf18       fibroblast growth factor 19
(FGF20)     -           XFGF-20
(FGF21)     -           fibroblast growth factor 21
(FGFH)      -           fibroblast growth factor homologous
(C05D11.4)  -           hypothetical 48.1 KD protein C05D11.4

Table from the Cytokine Family Database (http://cytokine.medic.kumamoto-u.ac.jp/)

3. IL-6 Family

Nucleic acids that encode members of the IL-6 family can be shuffled to
obtain surrogate ligands for orphan receptors that exhibit homology to the IL-6 receptor
family. Suitable nucleic acids that encode members of the IL-6 family include those listed in
Table 9.



Table 9: IL-6 Family

Gene Symbol             Gene Name
(Human)     (Mouse)
IL6         Il6         interleukin 6, B-cell stimulatory factor-2 (BSF-2), interferon-beta 2
CSF3        Csfg        colony stimulating factor 3, granulocyte colony stimulating factor
(MGF)       -           myelomonocytic growth factor

Table from the Cytokine Family Database (http://cytokine.medic.kumamoto-u.ac.jp/)

4. LIF/OSM Family

Similarly, nucleic acids that encode members of the leukemia inhibitory
factor/oncostatin M family of ligands can be shuffled to obtain surrogate ligands for orphan
receptors that exhibit homology to a known member of the LIF/OSM receptor family.
Nucleic acids that encode LIF/OSM ligands include those listed in Table 10.

Table 10: LIF/OSM Family

Gene Symbol             Gene Name
(Human)     (Mouse)
LIF         Lif         leukemia inhibitory factor, cholinergic differentiation factor
OSM         Osm         oncostatin M

Table from the Cytokine Family Database (http://cytokine.medic.kumamoto-u.ac.jp/)

5. MDK/PTN Family 

To obtain surrogate ligands for orphan receptors that exhibit homology to
receptors for the MDK/PTN family of cytokines, one can shuffle nucleic acids that encode
one or more of these cytokines. Representative examples are shown in Table 11.

Table 11: MDK/PTN Family

Gene Symbol             Gene Name
(Human)     (Mouse)
MDK         Mdk         midkine, retinoic acid-induced heparin-binding protein, neurite
                        growth-promoting factor-2 (NEGF2), retinoic acid-responsive protein
                        midkine pseudogene 1
PTN         Ptn         pleiotrophin (PTN), heparin-binding neurotrophic factor (HBNF-1),
                        osteoblast specific protein (OSF-1), heparin-binding growth factor 8
                        (HBGF-8), heparin-binding growth-associated molecule (HB-GAM),
                        neurite growth-promoting factor-1 (NEGF1), osteoblast stimulating
                        factor-1

Table from the Cytokine Family Database (http://cytokine.medic.kumamoto-u.ac.jp/)

15 6. NGF Family 

Nucleic acids that encode members of the nerve growth factor (NGF) family
can be shuffled to obtain surrogate ligands for orphan receptors that exhibit homology to the
NGF receptor family. Suitable nucleic acids that encode members of the NGF family include
those listed in Table 12.

Table 12: NGF Family

Gene Symbol          Gene Name
(Human)  (Mouse)

BDNF     Bdnf        brain-derived neurotrophic factor
NGFB     Ngfb        nerve growth factor, beta NGF
NTF3     Ntf3        neurotrophin-3, NT-3, NGF-2
NTF5     Ntf5        neurotrophin-4, neurotrophin-5, NT-4, NT-5
NTF6A                neurotrophin-6 alpha, NT-6 alpha
NTF6B                neurotrophin-6 beta, NT-6 beta
NTF6G                neurotrophin-6 gamma, NT-6 gamma
(NTF7)               neurotrophin-7

Unclassified
Table from the Cytokine Family Database (http://cytokine.medic.kumamoto-u.ac.jp/)

7. TNF Family 

Nucleic acids that encode members of the tumor necrosis factor (TNF) family
can be shuffled to obtain surrogate ligands for orphan receptors that exhibit homology to the
TNF receptor family. Suitable nucleic acids that encode members of the TNF family include
those listed in Table 13.

Table 13: TNF Family

Gene Symbol                  Gene Name
(Human)   (Mouse)

TNF       Tnf                tumor necrosis factor, TNFα (tumor necrosis factor α), TNF
                             superfamily member 2 (TNFSF2)
LTA       Lta                lymphotoxin, lymphotoxin α, TNF superfamily member 1
                             (TNFSF1), TNFβ
LTB       Ltb                lymphotoxin β, TNF superfamily member 3 (TNFSF3), TNFC
TNFSF3L   Tnfsf3l            TNF superfamily member 3 (LTB)-like peptidoglycan recognition
                             protein, peptidoglycan recognition protein precursor (PGRP)
TNFSF4    Txgp1l             tumor necrosis factor ligand superfamily member 4, TXGP1, OX-40
                             ligand, tax-transcriptionally activated glycoprotein 1 ligand
TNFSF5    Tnfsf5             tumor necrosis factor ligand superfamily member 5, CD40 antigen
                             ligand, CD40LG, CD40L, TNF-related activation protein (TRAP),
                             hyper-IgM syndrome, gp39
TNFSF6    Fasl               tumor necrosis factor ligand superfamily member 6, apoptosis
                             (APO-1) antigen ligand 1, APT1LG1, Fas ligand (FASL)
TNFSF7    Tnfsf7             tumor necrosis factor ligand superfamily member 7, CD70 antigen,
                             CD70, CD27 ligand, CD27LG, CD27L
TNFSF8    Tnfsf8             tumor necrosis factor ligand superfamily member 8, CD30 antigen
                             ligand, CD30LG, CD30L
TNFSF9    Tnfsf9             tumor necrosis factor ligand superfamily member 9, 4-1BB ligand,
                             4-1BBLG, CD antigen 137 ligand
TNFSF10   Trail              tumor necrosis factor ligand superfamily member 10, apoptosis
                             ligand TRAIL, Apo-2 ligand, TNF-related apoptosis inducing
                             ligand (TRAIL), TL2
TNFSF11   Tnfsf11            tumor necrosis factor ligand superfamily member 11, TNF-related
                             activation-induced cytokine, receptor activator of nuclear factor
                             kappa B ligand (RANKL), osteoprotegerin ligand, TNF-related
                             ligand (TRANCE), ODF
TNFSF12   Tnfsf12            tumor necrosis factor ligand superfamily member 12, TNF-related
                             weak inducer of apoptosis, TWEAK
TNFSF13                      tumor necrosis factor ligand superfamily member 13
TNFSF14                      tumor necrosis factor ligand superfamily member 14, LIGHT,
                             lymphotoxin-beta receptor (LTbR), ligand for herpesvirus entry
                             mediator (HVEML)
TNFSF15                      tumor necrosis factor ligand superfamily member 15, TL1
TNFSF18                      tumor necrosis factor ligand superfamily member 18 (TNFSF18),
                             AITRL, GITRL, glucocorticoid-induced TNFR-related protein
                             ligand (TNFSF18), AITR ligand (TL6)
TNFSF19   Tnfsf19-pending    tumor necrosis factor ligand superfamily member 19, KE05 protein,
                             FLDED-1, death effector domain-containing protein (DEDD)

Table from the Cytokine Family Database (http://cytokine.medic.kumamoto-u.ac.jp/)

8. TGF-β Family

Nucleic acids that encode members of the transforming growth factor-β (TGF-β)
family can be shuffled to obtain surrogate ligands for orphan receptors that exhibit homology
to the TGF-β receptor family. Suitable nucleic acids that encode members of the TGF-β
family include those listed in Table 14.



WO 00/52153 PCT/US00/05764

Table 14: TGF-β Family

Mullerian inhibitory substance (MIS)
Inhibins
Bone morphogenetic proteins BMP-2, BMP-3 (osteogenin), BMP-3B (GDF-10),
    BMP-4 (BMP-2B), BMP-5, BMP-6 (VGR-1), BMP-7 (OP-1) and BMP-8 (OP-2)
Embryonic growth factor GDF-1
Growth/development factor GDF-5
Growth/development factors GDF-3, GDF-6, GDF-7, GDF-8 (myostatin) and GDF-9
Mouse protein nodal
Chicken dorsalin-1 (dsl-1)
Xenopus vegetal hemisphere protein Vg1
Drosophila decapentaplegic protein (DPP-C)
Drosophila protein screw (scw)
Drosophila protein 60A
Caenorhabditis elegans larval development regulatory growth factor daf-7
Mammalian endometrial bleeding-associated factor (EBAF)
Mammalian glial cell line-derived neurotrophic factor (GDNF)

Once the nucleic acids are shuffled, the gene products of the shuffled nucleic
acids are screened to identify those that exhibit the desired activity on the orphan receptor.

B. Screening Methods 

A recombination cycle is usually followed by at least one cycle of screening 
or selection for molecules having a desired property or characteristic. For example, a library 
of recombinant polynucleotides can be screened to identify those that encode a polypeptide 
that can act as a surrogate ligand for an orphan receptor.

1. General considerations.

If a recombination cycle is performed in vitro, the products of recombination,
i.e., recombinant segments, are sometimes introduced into cells before the screening step.
Recombinant segments can also be linked to an appropriate vector or other regulatory
sequences before screening. Alternatively, products of recombination generated in vitro are
sometimes packaged as viruses before screening. If recombination is performed in vivo, 
recombination products can sometimes be screened in the cells in which recombination 
occurred. In other applications, recombinant segments are extracted from the cells, and 
optionally packaged as viruses, before screening. 

The nature of screening or selection depends on what property or
characteristic is to be acquired or the property or characteristic for which improvement is
sought, and several examples are discussed below. It is not usually necessary to understand
the molecular basis by which particular products of recombination (recombinant segments)
have acquired new or improved properties or characteristics relative to the starting
substrates. Screening/selection can then be performed, for example, for recombinant
surrogate ligands that have increased agonist activity on a target cell that displays the
receptor of interest without the need to attribute such improvement to any of the individual
component sequences of the surrogate ligand.

Depending on the particular screening protocol used for a desired property,
initial round(s) of screening can sometimes be performed in bacterial cells due to high
transfection efficiencies and ease of culture. Later rounds, and other types of screening
which are not amenable to screening in bacterial cells, are performed in mammalian cells to
optimize recombinant segments for use in an environment close to that of their intended use.
Final rounds of screening can be performed in the precise cell type of intended use (e.g., a
human cell).

The screening or selection step identifies a subpopulation of recombinant
polynucleotides that encode polypeptides that have evolved toward acquisition of a new or
improved desired receptor binding and/or modulatory activity. Depending on the screen, the
recombinant polynucleotides can be identified as components of cells, components of
viruses or in free form. More than one round of screening or selection can be performed after
each round of recombination.




If further improvement in a property is desired, at least one and usually a
collection of recombinant polynucleotides surviving a first round of screening/selection are
subject to a further round of recombination. These recombinant polynucleotides can be
recombined with each other or with exogenous segments representing the original substrates
or further variants thereof. Again, recombination can proceed in vitro or in vivo. If the
previous screening step identifies desired recombinant polynucleotides as components of
cells, the components can be subjected to further recombination in vivo, or can be subjected
to further recombination in vitro, or can be isolated before performing a round of in vitro
recombination. Conversely, if the previous screening step identifies desired recombinant
polynucleotides in naked form or as components of viruses, these polynucleotides can be
introduced into cells to perform a round of in vivo recombination. The second round of
recombination, irrespective of how it is performed, generates further recombinant polynucleotides
which encompass more diversity than is present in recombinant segments resulting from
previous rounds.

The second round of recombination can be followed by a further round of
screening/selection according to the principles discussed above for the first round. The
stringency of screening/selection can be increased between rounds. Also, the nature of the
screen and the property being screened for can vary between rounds if improvement in more
than one property is desired or if acquiring more than one new property is desired.
Additional rounds of recombination and screening can then be performed until the
recombinant segments have sufficiently evolved to acquire the desired new or improved
property or function.
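
The cycle of recombination followed by screening with increasing stringency can be illustrated with a short simulation. The sketch below is only a toy model under stated assumptions: sequences are equal-length strings, recombination is a single crossover between two survivors, and the `score` function is a hypothetical stand-in for an assay readout. The function names and parameter values are illustrative and are not part of the methods described herein.

```python
import random


def recombine(parents, n_children, rng):
    """Create chimeras by a single crossover between two randomly chosen parents.

    Assumes all parents are strings of the same length (at least 2 characters).
    """
    children = []
    for _ in range(n_children):
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a))  # crossover point inside the sequence
        children.append(a[:cut] + b[cut:])
    return children


def screen(library, score, keep_fraction):
    """Keep the best-scoring fraction of the library (never fewer than two members)."""
    ranked = sorted(library, key=score, reverse=True)
    n_keep = max(2, int(len(ranked) * keep_fraction))
    return ranked[:n_keep]


def evolve(parents, score, rounds=3, library_size=200, keep_fraction=0.2, seed=0):
    """Alternate recombination and screening, tightening the screen each round."""
    rng = random.Random(seed)
    pool = list(parents)
    for _ in range(rounds):
        # Survivors are recombined with each other; they are also carried
        # forward so that the best score found so far is never lost.
        library = recombine(pool, library_size, rng) + pool
        pool = screen(library, score, keep_fraction)
        keep_fraction = max(0.02, keep_fraction / 2)  # higher stringency next round
    return max(pool, key=score)


if __name__ == "__main__":
    # Hypothetical assay: count positions that match an arbitrary target pattern.
    target = "AACCAACC"
    toy_score = lambda seq: sum(x == y for x, y in zip(seq, target))
    starting = ["AAAAAAAA", "CCCCCCCC", "AAAACCCC"]
    best = evolve(starting, toy_score)
    print(best, toy_score(best))
```

Carrying the survivors forward alongside their progeny is a design choice of this sketch; it guarantees only that the best score does not decrease from round to round, not that an optimum is reached.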

Various screening methods for particular applications are described herein. In
some instances, screening involves expressing the recombinant peptides or polypeptides
encoded by the recombinant polynucleotides of the library as fusions with a protein that is
displayed on the surface of a replicable genetic package. For example, phage display can be
used. See, e.g., Cwirla et al., Proc. Natl. Acad. Sci. USA 87: 6378-6382 (1990); Devlin et al.,
Science 249: 404-406 (1990); Scott & Smith, Science 249: 386-388 (1990); Ladner et al., US
5,571,698. Other replicable genetic packages include, for example, bacteria, eukaryotic
viruses, yeast, and spores.

The genetic packages most frequently used for display libraries are
bacteriophage, particularly filamentous phage, and especially phage M13, fd and f1. Most
work has involved inserting libraries encoding polypeptides to be displayed into either gIII
or gVIII of these phage, forming a fusion protein. See, e.g., Dower, WO 91/19818; Devlin,
WO 91/18989; McCafferty, WO 92/01047 (gene III); Huse, WO 92/06204; Kang, WO
92/18619 (gene VIII). Such a fusion protein comprises a signal sequence, usually but not
necessarily from the phage coat protein, a polypeptide to be displayed, and either the gene III
or gene VIII protein or a fragment thereof. Exogenous coding sequences are often inserted
at or near the N-terminus of gene III or gene VIII, although other insertion sites are possible.

Eukaryotic viruses can be used to display polypeptides in an analogous
manner. For example, display of human heregulin fused to gp70 of Moloney murine
leukemia virus has been reported by Han et al., Proc. Natl. Acad. Sci. USA 92: 9747-9751
(1995). Spores can also be used as replicable genetic packages. In this case, polypeptides
are displayed from the outer surface of the spore. For example, spores from B. subtilis have
been reported to be suitable. Sequences of coat proteins of these spores are provided by
Donovan et al., J. Mol. Biol. 196, 1-10 (1987). Cells can also be used as replicable genetic
packages. Polypeptides to be displayed are inserted into a gene encoding a cell protein that
is expressed on the cell surface. Bacterial cells including Salmonella typhimurium, Bacillus
subtilis, Pseudomonas aeruginosa, Vibrio cholerae, Klebsiella pneumoniae, Neisseria
gonorrhoeae, Neisseria meningitidis, Bacteroides nodosus, Moraxella bovis, and especially
Escherichia coli are preferred. Details of outer surface proteins are discussed by Ladner et
al., US 5,571,698 and references cited therein. For example, the lamB protein of E. coli is
suitable.

A basic concept of display methods that use phage or other replicable genetic
package is the establishment of a physical association between DNA encoding a polypeptide
to be screened and the polypeptide. This physical association is provided by the replicable
genetic package, which displays a polypeptide as part of a capsid enclosing the genome of
the phage or other package, wherein the polypeptide is encoded by the genome. The
establishment of a physical association between polypeptides and their genetic material
allows simultaneous mass screening of very large numbers of phage bearing different
polypeptides. Phage displaying a polypeptide with affinity to a target, e.g., a receptor, bind
to the target and these phage are enriched by affinity screening to the target. The identity of
polypeptides displayed from these phage can be determined from their respective genomes.
Using these methods a polypeptide identified as having a binding affinity for a desired target
can then be synthesized in bulk by conventional means.
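
The enrichment of binding phage by repeated affinity screening can be illustrated with simple arithmetic. The sketch below assumes a two-population model in which a binder is retained on the target with one fixed probability per round and a non-binder with another; both probabilities, and the starting fraction, are hypothetical values chosen for illustration and are not measurements from any experiment described herein.

```python
def binder_fraction(initial_fraction, retain_binder, retain_background, rounds):
    """Fraction of the phage population that displays a binder after each round.

    initial_fraction  : starting fraction of binders in the library (0 to 1)
    retain_binder     : assumed probability that a binder is recovered per round
    retain_background : assumed probability that a non-binder is recovered per round
    """
    fraction = initial_fraction
    history = [fraction]
    for _ in range(rounds):
        bound = fraction * retain_binder
        background = (1.0 - fraction) * retain_background
        # Amplification between rounds rescales the population but leaves
        # the ratio of binders to non-binders unchanged.
        fraction = bound / (bound + background)
        history.append(fraction)
    return history


if __name__ == "__main__":
    # Hypothetical numbers: one binder per million phage, 1000-fold
    # preferential recovery of binders in each round.
    for round_no, f in enumerate(binder_fraction(1e-6, 0.1, 1e-4, 3)):
        print(f"after round {round_no}: binder fraction = {f:.3g}")
```

Under this model the per-round enrichment is governed by the ratio of the two retention probabilities, which is why a modest preference for binders can suffice when several rounds are performed.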

2. Screening assays for surrogate ligand or surrogate receptor activity

Screening of the recombinant libraries can involve identifying those members
that encode a polypeptide that specifically binds to the receptor of interest. The libraries of
recombinant polynucleotides are expressed and those that can bind to the receptor with a
desired specificity and avidity are chosen for use, or for further improvement. In presently
preferred embodiments, the library of recombinant polypeptides is displayed on the surface
of a replicable genetic package.

For some applications, a binding assay is sufficient to identify a surrogate
ligand or surrogate receptor. However, in other applications, it is desirable to obtain a
surrogate that exerts a biological activity upon binding to its orphan counterpart. The
biological activity assay can be conducted after pre-screening using a binding assay, or can
be used on its own without a prescreen.

In some embodiments, the libraries of recombinant polynucleotides are
screened by expressing the library and contacting the resulting library of candidate surrogate
ligands with a test cell that contains the receptor of interest, or at least a sufficient portion for
biological activity. Suitable test cells are those that are known to allow biological activity for
previously known members of the ligand family to which the surrogate ligand presumably
belongs.

For receptors such as cytokine receptors, the extracellular domain of the
receptor of interest is expressed as a fusion with the cytoplasmic domain of a known
receptor. The transmembrane domain of the known receptor or of the receptor of interest can
also be included in the fusion protein. The fusion protein is displayed on a cell that is
permissive for the biological activity of known ligands for the receptor family to which the
receptor of interest is presumed to belong. Upon binding of a surrogate ligand to the
extracellular domain, the biological activity is observed.

In some embodiments, the screening methods of the invention use a cell that
contains a polypeptide that has a ligand binding domain of the receptor of interest (e.g., an
orphan receptor). The polypeptide will also include a DNA binding domain, which can be
that of the orphan receptor, or more preferably is obtained from a known receptor or is a
DNA binding domain for which the response element is known (e.g., Gal4, nuclear hormone
receptors, and the like). Examples of suitable chimeric polypeptides are described in more
detail above. Conveniently, the chimeric receptor polypeptide is introduced into the cell by
expression of a polynucleotide that encodes the receptor polypeptide. For example, an
expression vector that encodes the chimeric receptor can be introduced into the cell that is to
be used in the assay.

For a nuclear receptor, the cells preferably also contain a response element
that can be bound by the DNA binding domain. The response element is operably linked to a
promoter that is active in the cell. In presently preferred embodiments, the promoter is
operably linked to a reporter gene that, when expressed, produces a readily detectable
product. The response element/reporter gene construct is conveniently introduced into cells
as part of a "reporter plasmid."

For some screening assays, it is desirable to present to the assay a standard
amount of the ligand being tested. In such instances, one can "tag" the ligands with a
suitable affinity tag and express the ligands in an expression system known to allow
biological activity for the previously known members of the family to which the ligand
presumably belongs. Cell extracts and/or supernatants that contain the expressed ligands can
be simultaneously affinity purified in a batchwise fashion, for example, in pools, and eluted.
The system can be calibrated such that differences in expression level of the different ligands
(which differences are likely to occur) would not result in differences in the total amount of
ligand presented in an assay. For example, one can use a 10-50-fold excess of ligand over the
capacity of the affinity purification support, so that the support is saturated and the amount
eluted is set by the support rather than by the expression level.
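
The effect of loading ligand in excess of the support capacity can be shown with a short calculation. The sketch below assumes an idealized support that binds ligand up to a fixed capacity and releases all of it on elution; the capacity value and expression levels are hypothetical and are given in arbitrary units.

```python
def eluted_amount(expressed, capacity):
    """Ligand recovered from an idealized affinity support.

    Assumes the support binds ligand up to `capacity` and that everything
    bound is recovered on elution (an illustrative simplification).
    """
    return min(expressed, capacity)


if __name__ == "__main__":
    capacity = 1.0  # arbitrary units
    # Hypothetical expression levels spanning a 10- to 50-fold excess over
    # the capacity, plus one poorly expressed ligand below the capacity.
    expression_levels = {"clone_A": 10.0, "clone_B": 25.0, "clone_C": 50.0, "clone_D": 0.3}
    for clone, expressed in expression_levels.items():
        print(clone, eluted_amount(expressed, capacity))
```

In this idealized picture, every clone expressed above the capacity yields the same eluted amount, whereas a clone expressed below the capacity yields less; this is the situation in which a poorly expressed ligand could be under-represented.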

In assays in which pools are processed, the levels of individual members
within each pool will not be identical. In such situations, positive pools are identified
without concern for false negatives due to poor expression of any particular ligand surrogate.

3. Screening assays to identify compounds that modulate activity of a
surrogate ligand or surrogate receptor.

The invention also provides screening assays for identifying compounds that
can modulate the biological activity of a surrogate ligand or a surrogate receptor obtained
using the methods of the invention. These compounds can function by, for example, altering
the interaction between the receptors and their ligands, or between the receptors and the
remainder of the signal transduction pathway. Compounds that are identified using the
screening methods of the invention find use in studies of interactions between the ligand and
receptor and in studies of signal transduction. The compounds also find therapeutic use in
situations in which it is desirable to increase or decrease expression of genes that are under
the control of a particular receptor. Other uses will also be apparent to those of ordinary skill in
the art.

In the screening methods for obtaining modulators, a test system such as
those described above can be used. For example, host cells that contain a reporter plasmid, a
chimeric receptor polypeptide, and the surrogate ligand are incubated in the presence of a
test compound. Essentially any chemical compound can be used as a potential modulator in
the assays of the invention, although most often compounds that can be dissolved in aqueous
or organic (especially DMSO-based) solutions are used. The assays are designed to screen
large chemical libraries by automating the assay steps and providing compounds from any
convenient source to assays, which are typically run in parallel (e.g., in microtiter formats on
microtiter plates in robotic assays). It will be appreciated that there are many suppliers of
chemical compounds, including Sigma (St. Louis, MO), Aldrich (St. Louis, MO), Sigma-
Aldrich (St. Louis, MO), Fluka Chemika-Biochemica Analytika (Buchs, Switzerland) and the
like.

In one preferred embodiment, high throughput screening methods involve
providing a combinatorial library containing a large number of potential therapeutic
compounds (potential modulator compounds). Such "combinatorial chemical libraries" are
then screened in one or more assays, as described herein, to identify those library members
(particular chemical species or subclasses) that display a desired characteristic activity. The
compounds thus identified can serve as conventional "lead compounds" or can themselves
be used as potential or actual therapeutics.

A combinatorial chemical library is a collection of diverse chemical
compounds generated by either chemical synthesis or biological synthesis, by combining a
number of chemical "building blocks" such as reagents. For example, a linear combinatorial
chemical library such as a polypeptide library is formed by combining a set of chemical
building blocks (amino acids) in every possible way for a given compound length (i.e., the
number of amino acids in a polypeptide compound). Millions of chemical compounds can
be synthesized through such combinatorial mixing of chemical building blocks.
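
The size of a linear combinatorial library follows directly from the number of building blocks and the compound length. As a minimal sketch, assuming each position can hold any building block independently, the number of distinct compounds is the number of building blocks raised to the power of the length; the function names below are illustrative only.

```python
from itertools import product


def library_size(n_building_blocks, length):
    """Number of distinct linear compounds of a given length."""
    return n_building_blocks ** length


def enumerate_library(building_blocks, length):
    """Yield every linear combination of the building blocks (small cases only)."""
    for combination in product(building_blocks, repeat=length):
        yield "".join(combination)


if __name__ == "__main__":
    # Twenty amino acids combined in every possible way at each length.
    for length in range(1, 7):
        print(length, library_size(20, length))
    # A tiny explicit library built from two hypothetical building blocks.
    print(list(enumerate_library("AG", 3)))
```

With the twenty common amino acids, a length of five already gives 3,200,000 possible pentapeptides, which is consistent with the statement that millions of compounds can be made in this way.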

Preparation and screening of combinatorial chemical libraries is well known
to those of skill in the art. Such combinatorial chemical libraries include, but are not limited
to, peptide libraries (see, e.g., U.S. Patent 5,010,175, Furka, Int. J. Pept. Prot. Res. 37:487-
493 (1991) and Houghton et al., Nature 354:84-88 (1991)). Other chemistries for generating
chemical diversity libraries can also be used. Such chemistries include, but are not limited
to: peptoids (PCT Publication No. WO 91/19735), encoded peptides (PCT Publication WO
93/20242), random bio-oligomers (PCT Publication No. WO 92/00091), benzodiazepines
(U.S. Pat. No. 5,288,514), diversomers such as hydantoins, benzodiazepines and dipeptides
(Hobbs et al., Proc. Nat. Acad. Sci. USA 90:6909-6913 (1993)), vinylogous polypeptides
(Hagihara et al., J. Amer. Chem. Soc. 114:6568 (1992)), nonpeptidal peptidomimetics with
β-D-glucose scaffolding (Hirschmann et al., J. Amer. Chem. Soc. 114:9217-9218 (1992)),
analogous organic syntheses of small compound libraries (Chen et al., J. Amer. Chem. Soc.
116:2661 (1994)), oligocarbamates (Cho et al., Science 261:1303 (1993)), and/or peptidyl
phosphonates (Campbell et al., J. Org. Chem. 59:658 (1994)), nucleic acid libraries (see,
Ausubel, Berger and Sambrook, all supra), peptide nucleic acid libraries (see, e.g., U.S.
Patent 5,539,083), antibody libraries (see, e.g., Vaughn et al., Nature Biotechnology,
14(3):309-314 (1996) and PCT/US96/10287), carbohydrate libraries (see, e.g., Liang et al.,
Science, 274:1520-1522 (1996) and U.S. Patent 5,593,853), small organic molecule libraries
(see, e.g., benzodiazepines, Baum C&EN, Jan 18, page 33 (1993); isoprenoids, U.S. Patent
5,569,588; thiazolidinones and metathiazanones, U.S. Patent 5,549,974; pyrrolidines, U.S.
Patents 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent 5,506,337;
benzodiazepines, 5,288,514, and the like).

Devices for the preparation of combinatorial libraries are commercially
available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony,
Rainin, Woburn, MA, 433A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore,
Bedford, MA). In addition, numerous combinatorial libraries are themselves commercially
available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos, Inc., St.
Louis, MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek
Biosciences, Columbia, MD, etc.).

EXAMPLES

The following examples are offered to illustrate, but not to limit the present
invention.

The following abbreviations are used herein: IFN-α, alpha interferon; Hu-
IFN-α, human IFN-α; Mu-IFN-α, murine IFN-α; HTP, high throughput; CHO, Chinese
hamster ovary; EPO, erythropoietin; GM-CSF, granulocyte macrophage colony stimulating
factor; G-CSF, granulocyte colony stimulating factor; IL, interleukin; PBS, phosphate
buffered saline; CPE, cytopathic effect.

Example 1

RAPID EVOLUTION OF A CYTOKINE USING MOLECULAR BREEDING

Molecular breeding is the application of classical breeding to sub-genomic
sequences. This approach to sequence evolution generalizes concepts from classical
genetics, allowing one to selectively breed DNA sequences in the test tube. In this study, in
vitro DNA shuffling was used to breed a family of over 20 human interferon alpha (Hu-IFN-
α) genes for increased antiviral and anti-proliferation activities in murine cells. Only 68
assays of pools of interferons were used to obtain a clone with 135,000-fold improved
specific activity over Hu-IFN-α2a in the first cycle of shuffling. After a second cycle of
selective breeding, the most active clone was improved 285,000-fold relative to Hu-IFN-α2a.
Remarkably, the three most active clones are more active than the native murine IFN-αs.
These chimeras are derived from up to five parental genes, but contain no random point
mutations. These results demonstrate that diverse cytokine gene families can be used as
breeding stock from which to rapidly evolve cytokines that are more active or have superior
selectivity profiles relative to native cytokine genes. Molecular breeding provides an economical
alternative to genomics-based approaches to searching for potent activities of interest in
existing genomes.

Introduction

Alpha interferons are members of the diverse helical-bundle super-family of
cytokine genes that contains many clinically important pharmaceutical proteins such as EPO,
GM-CSF, G-CSF, IFN-α, IFN-β, IL-2, IL-3, IL-4 and several other interleukins (Sprang and
Bazan (1993) Current Opinion in Structural Biology 3:815-827). While these proteins have
important therapeutic value in the treatment of a number of diseases, they have not been
optimized by natural selection as pharmaceuticals. For example, dose-limiting toxicity,
receptor cross-reactivity, and short serum half-lives significantly reduce the clinical utility of
many of these cytokines (Dusheiko, G. (1997) Hepatology 26(3 Suppl 1):112S-121S; Vial
and Descotes (1994) Drug Experience 10(2):115-150; Funke et al. (1994) Ann. Hematol.
68(1):49-52; Schomberg et al. (1993) J. Cancer Res. Clin. Oncol. 119(12):745-55).
Molecular breeding provides a general method for improving these properties.

The cytokine super-family has evolved by a series of gene duplications and
recombination events. For example, the α, β and ω interferons are derived by ancient
duplication of a common ancestor with subsequent recombination within the IFN-α gene
family (Hughes, A. L. (1995) J. Mol. Evol. 41(5):539-48). Similarly, the genes encoding
IL-4 and IL-13 are in proximity in human and murine genomes and they share several, but
not all, of their biological functions (Punnonen et al. (1993) Proc. Nat'l. Acad. Sci. USA
90(8):3730-4), suggesting that they have arisen by gene duplication. The receptors for the
cytokine supergene family have also been generated by duplication, mutation, and
recombination of a few modular receptor domains (Uze et al. (1995) J. Interferon Cytokine
Res. 15(1):3-26; Bazan et al. (1990) Proc. Nat'l. Acad. Sci. USA 87(18):6934-8).

The human IFN-αs are encoded by a family of over twenty tandemly
duplicated non-allelic genes that share 85-98% sequence identity at the amino acid level
(Henco et al. (1995) J. Mol. Biol. 185(2):227-60). These proteins have potent antiviral and
anti-proliferative activities that have great clinical utility as anticancer and antiviral
therapeutics. While the utility of chimeric IFNs derived from this gene family has been
recognized (Horisberger and Di Marco (1995) Pharmacol. Ther. 66(3):507-34), only a small
fraction of the 10^26 possible chimeras have been explored either in natural human evolution
or by the methods of modern molecular biology, and only one natural IFN-α subtype, Hu-
IFN-α2, has been used in extensive clinical studies (Id.). The most active engineered IFN-α,
IFN-Con1, is a consensus of thirteen wild type Hu-IFN-α genes that is currently being used
in hepatitis C therapy (Blatt et al. (1996) J. Interferon Cytokine Res. 16(7):489-99).

DNA shuffling, or molecular breeding, is a method for permutation of natural
genetic diversity. This technology provides a powerful tool for rapidly evolving single
genes, operons and whole viruses for desired properties (Stemmer, W. P. C. (1995)
Biotechnology 13: 549-555; Patten et al. (1996) Current Opinion in Biotechnology 8:724-
733; Crameri et al. (1998) Nature 15:288-91), and has many advantages relative to random
mutation or rational sequence design. This Example describes the use of family DNA
shuffling to rapidly evolve the Hu-IFN-α gene family for activity in mouse cells. The native
Hu-IFN-α genes are 53-65% identical to Mu-IFN-αs and exhibit very weak activity on
murine cells (Horisberger and Di Marco, supra). Similarly, the extra-cellular domains of the
IFN-α receptors share only 49% sequence identity (Uze et al., supra). Despite these
sequence differences, we obtained shuffled IFN-αs that are more potent in mouse cells than
the native Mu-IFN-αs.

Experimental protocols

DNA cloning, sequencing and shuffling

The Hu-IFN-α gene family was PCR amplified from human genomic DNA
using twelve sets of degenerate primers. Three hundred micrograms of PCR product was
fragmented with DNase I, 25-60 bp fragments were gel purified, and family shuffling of the
fragments was performed as described (Crameri et al. (1998) Nature 15:288-91). Two
additional libraries of shuffled Hu-IFN-α genes were made from eight cloned Hu-IFN genes
(Hu-IFN-αs 1, 4, 5, 6, 14, 16, 17 and F). Fragments of 25-50 or 50-100 bp were purified,
and shuffling was done as described (Crameri et al., supra). Assembled insert was cloned
by standard methods into the phagemid display vector pDEI-932. Hu-IFN-α-Con1 was
constructed from synthetic oligonucleotides. Hu-IFN-αs 1, 2a, 4, 5, 6, 14, 16, 17 and F, and
Mu-IFN-αs 1, 4 and 6 were cloned from genomic DNA and sequenced on an ABI DNA
sequencer.

DNA sequence analysis 

The extracellular domains of the human and mouse IFN-α receptors were
aligned by the Clustal method (DNA STAR; SWISS-PROT accession numbers P33896,
P17181, P48551; GENBANK accession number AF013274).

Phagemid display of IFN-α

For HTP primary screening of activity, shuffled Hu-IFN-α genes were
expressed in a biologically active form by phage display, similarly to the expression strategy
used for other four helix bundle cytokines. The phagemid display vector pDEI-932 is a
standard gene III phagemid display vector wherein the STII leader is fused to the amino
terminus of Hu-IFN-α and the E-tag (Pharmacia) plus a 6-His tag are fused to the carboxyl
terminus. Immediately following the C-terminal tag is a suppressible amber codon, followed
by M13 gene III (fused at residue 247 of gene III). The IFN-α gene III insert is under the
control of the pBAD promoter, and the backbone plasmid is an AmpR derivative of pBR322
containing an M13 origin of replication. Large scale (250 ml) phagemid preps were done by
standard methods (Klaus et al. (1997) J. Mol. Biol. 274(4):661-75) in the presence of
0.002% arabinose to induce expression of the IFN-α gene III fusion. Phagemids were PEG
precipitated, CsCl banded, and dialyzed into PBS prior to assaying.

HTP phagemid preparations

For the purposes of HTP screening, E. coli harboring phagemids were picked
with a Q-BOT robotic colony picker (Genetix) into 96-well plates containing 100 microliters
of 2XYT per well. Confluent cultures were grown overnight at 37° C. The overnight
cultures were diluted 20-fold into fresh 2XYT/Amp/0.002% arabinose/10^10 pfu/ml M13
VCS helper phage and grown for four hours with vigorous shaking. The cells were pelleted
and phage supernatants were transferred to 96-well dialysis plates containing a 100
kilodalton cutoff membrane prior to assaying. Samples were dialyzed against PBS and then
filter sterilized through 96-well 0.45 micron membranes. Sterile phagemid samples were
used directly in cellular assays.
Antiviral assays

Antiviral activities were determined by the cytopathic effect (CPE) reduction
assay on mouse L929 cells challenged with encephalomyocarditis virus (EMCV). Briefly,
target cells were grown to confluence, trypsinized, and distributed into 96-well flat-bottom
microtitre plates (10^4 cells per well) in RPMI medium supplemented with 10% FCS and
Penicillin/Streptomycin antibiotics. IFN-a samples were titrated in triplicate in 5-fold
dilutions. After incubation for 16 hours, the medium was removed and replaced with medium
containing EMCV (100 TCID50 per well), and the plates were incubated for 2 days until CPE
occurred. Medium was removed, the cells were washed twice with PBS, and neutral red
(1:100 dilution) was added and incubated for 2 hours. During the last 20 minutes, cells were
fixed with 0.5% glutaraldehyde. The unstained dye solution was removed, the plates were
washed twice with PBS, and the color was extracted with 50% methanol, 1% acetic acid.
The extracted dye solution in each well was quantitated colorimetrically at 540 nanometers
with a spectrophotometer. Sigmoidal dose-response curves were produced by plotting cell
viability against the logarithm of the IFN-a concentration. One unit/ml is defined as the
interpolated IFN-a concentration giving 50% protection (on a scale of 0 to 100% determined
by controls with no IFN-a and with or without virus).
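
The unit definition above amounts to finding, on a logarithmic concentration axis, the point at which protection crosses 50%. A minimal Python sketch of that calculation follows. The function names, the two-point log-linear interpolation (used here in place of a fitted sigmoidal curve), and the absorbance readings are illustrative assumptions, not the procedure or data of this study.

```python
import math

def percent_protection(a540, virus_only, cells_only):
    """Scale an A540 reading to 0-100% protection using the two controls."""
    return 100.0 * (a540 - virus_only) / (cells_only - virus_only)

def dilution_at_half_protection(dilutions, protections):
    """Return the dilution factor at which protection is 50%.

    dilutions   -- increasing dilution factors (e.g. 5, 25, 125, ...)
    protections -- percent protection at each dilution (falls as dilution rises)
    Interpolates linearly in log10(dilution) between the two bracketing points.
    """
    points = list(zip(dilutions, protections))
    for (d_a, p_a), (d_b, p_b) in zip(points, points[1:]):
        if p_a >= 50.0 >= p_b and p_a != p_b:
            frac = (p_a - 50.0) / (p_a - p_b)
            log_d = math.log10(d_a) + frac * (math.log10(d_b) - math.log10(d_a))
            return 10.0 ** log_d
    raise ValueError("50% protection is not bracketed by the titration")

# Hypothetical 5-fold titration of one sample (values invented for illustration).
virus_only, cells_only = 0.10, 1.10          # A540 of the 0% and 100% controls
dilutions = [5, 25, 125, 625, 3125]
readings = [1.08, 1.00, 0.85, 0.35, 0.12]    # A540 at each dilution

protections = [percent_protection(a, virus_only, cells_only) for a in readings]
titer = dilution_at_half_protection(dilutions, protections)
print(f"approximately {titer:.0f} units/ml in the undiluted sample")
```

Because one unit/ml is the concentration giving 50% protection, the dilution factor at that point equals the titer of the undiluted sample in units/ml. In this invented example the crossing falls halfway (on the log scale) between the 125-fold and 625-fold dilutions, or about 280 units/ml.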

Deconvolution of libraries 

In cycle one, eight pools of 12 were assayed, and one had measurable
antiviral activity. Sixteen pools of 96 were assayed, the most active pool of 96 was broken
into eight pools of 12, and these pools were assayed separately. Three pools of 12 had
measurable activity, and thirty-six individual phagemids were prepared, purified and assayed
from these pools. One chimera (Hu-IFN-a-CH1.4) was obtained by randomly screening
individual clones in the library for L929 antiviral activity. Three IFN-a phagemids with
antiviral activity were obtained (one from each pool). The IFN-a chimeras from these
phagemids were cloned into the CHO expression vector pDEI-1011, transfected, and
purified as described.
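
The assay counts in the pooled deconvolution above can be tallied directly. The short Python sketch below only restates the pool counts given in this section; the variable names are illustrative, and reading the tally as the authors' own accounting is an assumption.

```python
# Each entry: (what was assayed in cycle one, number of assays).
cycle_one_assays = [
    ("pools of 12 clones", 8),
    ("pools of 96 clones", 16),
    ("sub-pools of 12 from the most active pool of 96", 8),
    ("individual phagemids from the three active sub-pools", 36),
]

total_assays = sum(count for _, count in cycle_one_assays)
print(total_assays)
```

The total of 68 agrees with the number of murine antiviral assays cited in the Results and Discussion for the first-round screen.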






Construction of round two libraries 

Five cycle two libraries were constructed by shuffling equimolar quantities of
plasmid DNA in the following combinations: CH1.1 x CH1.2; CH1.1 x CH1.3; CH1.1 x
CH1.4; CH1.2 x CH1.3; CH1.1 x CH1.2 x CH1.3 x CH1.4. Shuffled libraries were
made in pDEI-932 from 25-50 bp fragment assemblies as described (Crameri et al., supra).

HTP proliferation assays

The L929 anti-proliferative assay was performed according to standard 3H
thymidine incorporation methods. Briefly, IFN-a samples were titrated in triplicate in 5-fold
dilution steps down the plate. For HTP screening in the second round of shuffling, four
single 10-fold dilutions were assayed in the primary screen, and subsequent rescreens were
done in triplicate. L929 cells (1000/well) were incubated for 72 hours at 37° C in a 5% CO2
incubator. During the last 16 hours of incubation, 1 uCi/well of 3H thymidine was added.
The plates were then harvested on a Harvester-96 (Tomtec) and thymidine incorporation was
counted on a beta counter (Microbeta, Wallac).

CHO expression and purification of shuffled IFN-as

IFN-a genes were cloned into a standard CHO expression vector (pDEI-1011)
in which the E-tag/6-His tag (Pharmacia) is fused to the C-terminus of the IFN-as.
Expression is driven by the SR-a promoter, and stable transfectants were selected at 1 mg/ml
G418. The four most active clones from the first round and the fifteen most active clones
from the second round were inserted into pDEI-1011 and introduced into CHO cells by
transfection (Sambrook et al., supra), and the proteins were affinity purified from the
supernatant on anti-E tag Sepharose (Pharmacia).

Daudi proliferation assays

Eight chimeric phage-displayed IFN-as were sequenced from randomly
picked clones. Four of the eight sequences encoded in-frame IFN-a genes. These four
chimeras and Hu-IFN-a2a were expressed, purified, and assayed for anti-proliferation
activity on human Daudi cells. The Daudi antiproliferation assay was done as described
(Scarozza et al. (1992) J. Interferon Res. 12:35-42). One unit/ml is defined as the
concentration giving half-maximal inhibition of proliferation. Two thirds of the clones in the
cycle two library were more potent than Hu-IFN-a-CH1.4 in the HTP L929 antiproliferation
assay.



Results And Discussion 

Two rounds of molecular breeding and screening were performed. In the first
round, family DNA shuffling by homologous in vitro recombination was used to make a
library of chimeric Hu-IFN-as. All of the Hu-IFN-a genes, including pseudogenes, were
shuffled in order to capture the diversity of the entire family. Chimeric IFN-as were
expressed, purified, and screened for L929 antiviral activity as pools of 12 or 96. The active
pools were deconvoluted into sequentially smaller pools until single active clones were
identified. Because a pooling strategy was used, a total of only 68 murine antiviral assays
was needed to screen this library of 1672 clones. The most active chimeric IFN-a from round
one (IFN-a-CH1.1) is derived from six parental Hu-IFN-a gene segments (Figure 1A), and
is 87-fold more active than Hu-IFN-a1, the wild type Hu-IFN-a that is most active in
murine cells (Table 15). The large improvement in activity that was obtained in the first
round of screening of this shuffled library using only 68 assays has important implications
for the range of applications of molecular breeding, as discussed below.

Table 15

Activities of Parental and Evolved IFN-as in Murine Cells

DNA Shuffling Cycle    IFN-a Gene    Genealogy    L929 antiviral activity (Units/mg)    Fold Improvement in Activity vs Hu-IFN-a2a
DNA shuffling allows one to use analogs of classical breeding methods and
to extend breeding in non-classical ways such as by breeding of more than two parental
genes in a single molecular breeding reaction or breeding of genes from different species
(Stemmer, W. P. C. (1995) Biotechnology 13:549-555; Patten et al. (1996) Current Opinion
in Biotechnology 8:724-733; Crameri et al., supra). As with classical breeding, the sampling
of shuffled libraries is generally non-exhaustive. Indeed, the power of breeding is that large
improvements in phenotype can be achieved by recursively screening only a small subset of
all theoretically possible progeny (Burbank, L., "Short-cuts into the centuries to come: better
plants secured by hurrying evolution," In Luther Burbank: His Methods and Discoveries,
Their Practical Application, Vol. 1, pp. 176-210 (Whitson and Williams, eds.; New York:
Luther Burbank Press, 1914); Haldane, J. B. S. (1924) Cambridge Phil. Soc. Trans.
23:19-41). It is therefore important to determine the most economical selective molecular
breeding strategies.

The data from this experiment begin to address this important issue. In cycle
two, we compared breeding strategies by doing pooled and pair-wise matings of the four
IFN-a genes from round one to make five new libraries of chimeras. Four libraries were
made by pair-wise matings of the genes and one library by pooled mating of all four genes.
A HTP assay was used to screen 1056 individual clones from this panel of five libraries, and
the top sixty candidates were rescreened quantitatively for antiviral activity in L929 cells.
The genes from the eleven most active shuffled IFN-as were expressed in CHO cells and
purified IFN-a protein was assayed. The most active IFN-a from cycle two is improved
185-fold relative to Hu-IFN-a1 and 285,000-fold relative to Hu-IFN-a2a (Figures 2, 3).
Remarkably, the activities of the three most active IFN-as exceed the activity of the most
active native mouse IFN-a, Mu-IFN-a4 (Table 15, Figure 3). The most active clones from
round two came from the pair-wise matings of highly active clones (Hu-IFN-a-CH1.1 x
Hu-IFN-a-CH1.3), with none of the most active clones in round two coming from the pooled
mating (Table 15). The superior performance of pair-wise matings relative to pooled
matings may reflect sparse sampling of a population with a significantly lower average level
of biological activity in clones derived from the pooled mating, due to breaking up favorable
amino acid combinations such as K121 and R125, as discussed below.
Libraries of family shuffled IFN-as have few inactive or weakly active
clones. In contrast, random mutagenesis typically leads to a high frequency of gene
inactivation (Muller, H. J. (1964) Mutat. Res. 1:2-9; Moore et al. (1997) J. Mol. Biol.
272:336-347). For example, 75% of random point mutants of residues 120-136 of
Hu-IFN-a4 are inactive (Tymms et al. (1990) Genet. Anal. Tech. Appl. 7(3):53-63). To assess the
knockout rate in our primary libraries, we assayed four randomly chosen intact IFN-a
chimeric genes from our libraries in a human cell proliferation assay (Daudi). All four
shuffled IFN-as are as active in human Daudi cells as is Hu-IFN-a2a, despite having 10 to
21 amino acid changes relative to the closest native Hu-IFN-a (Figure 1; Experimental
protocols). The second round of shuffling in this study gives an additional indication of the
high quality of shuffled libraries, as two thirds of the clones from the second round of
shuffling are more active in mouse cells than Hu-IFN-a1, the most active native Hu-IFN-a.
The diversity in the libraries in this study was overwhelmingly generated by recombination
of pre-existing natural sequence diversity in the gene family, with random point mutation
accounting for only two sequence changes in the four round one chimeras (Figure 1A).
These random mutations were removed in the second round of breeding by recombination
with native gene segments, and thus there were no random point mutations in the three most
active round two chimeras (Figure 1B).

The dramatic difference between family shuffled libraries and libraries made
by random point mutagenesis can be understood by considering that family shuffling
permutes blocks of sequence containing conservative amino acid substitutions that have been
selected for function during millions of years of purifying natural selection (Stemmer,
supra; Patten et al., supra; Crameri et al., supra; Muller, supra). Consequently, the
sequence space defined by recombination of natural diversity is highly pre-selected for
function and represents an infinitesimal fraction of the sequence space accessible by random
mutation. For example, the Hu-IFN-a genes differ from each other by an average of 17
residues (Henco et al. (1985) J. Mol. Biol. 185(2):227-60). There are 10^45 17-step random
mutants of Hu-IFN-a, whereas there are 10^26 permutations of the natural Hu-IFN-a
sequence diversity. (The number of possible recombinants of the natural Hu-IFN-a
sequence diversity is calculated as follows: the Hu-IFN-a gene family is variable at 76 sites
(Id.). However, there is a very limited range of amino acid changes at these sites. There are
two, three or four amino acid alternatives at 57, 15 and 4 sites, respectively (Id.), so the
number of possible recombinants is 2^57 x 3^15 x 4^4 = 5x10^26.) Thus, shuffled IFN-as
sample only 10^-19 of the random point mutant spectrum. In contrast to family shuffled
libraries, an infinitesimal fraction of 17-step random point mutants of Hu-IFN-as are
expected to be active (Muller, supra; Moore et al., supra; Tymms et al., supra), and these
libraries of shuffled chimeras are therefore highly enriched for functional clones relative to
libraries made by random point mutagenesis. This result illustrates the striking ability of
family shuffling to generate progeny that differ from the parent molecules at many residues,
while still retaining potent biological activity.
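
The sequence-space comparison above can be checked with a few lines of arithmetic. The Python sketch below reproduces the recombinant count from the site numbers given in the text. The count of 17-step random mutants depends on how such mutants are enumerated, which the text does not spell out; the sketch assumes a 166-residue protein, 17 distinct mutated positions, and 19 alternative residues at each position. Those three figures are assumptions made for illustration.

```python
import math

# Recombinants of natural diversity (site counts taken from the text):
# 57, 15 and 4 variable sites with two, three and four alternatives.
recombinants = 2**57 * 3**15 * 4**4

# 17-step random mutants. ASSUMPTIONS (not from the text): 166 residues,
# 17 distinct positions mutated, 19 alternative residues per position.
protein_length = 166
steps = 17
random_mutants = math.comb(protein_length, steps) * 19**steps

# Fraction of the random-mutant space covered by the recombinants.
fraction = recombinants / random_mutants

print(f"recombinants   ~ 10^{math.log10(recombinants):.1f}")
print(f"random mutants ~ 10^{math.log10(random_mutants):.1f}")
print(f"fraction       ~ 10^{math.log10(fraction):.1f}")
```

The recombinant count evaluates to about 5x10^26, as stated. Under the assumed enumeration, the random-mutant count falls within an order of magnitude of the 10^45 quoted in the text, so the resulting fraction is of the same rough order as the figure obtained from the rounded values.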

As a consequence of the high average activity of members of the family
shuffled libraries, direct screening for biological activity is possible. The ability to directly
screen for the desired biological function rather than using a surrogate screen or selection is
a significant advantage over other strategies such as phage panning because one can use a
small number of complex biological assays to directly obtain clones with the desired
biological activity. The high quality of family shuffled libraries profoundly affects the
approaches that can be taken to improving complex genetic traits. Evolution of commercially
important genes and proteins may be practical even when very complex, time consuming, or
expensive assays are required.

Immunogenicity is clinically significant for many recombinant
pharmaceutical proteins (van der Meide and Schellekens (1997) Biotherapy 10(1):39-48;
Konrad, M. (1989) Tibtech 7:175-179; Allegreta et al. (1986) J. Clin. Immunol. 6:481-490).
The ability to evolve proteins with immunologically conservative changes while reducing
properties that impact immunogenicity such as propensity to unfold, aggregate, or oxidize,
may be useful for reducing immunogenicity. It is typically more difficult to raise antibodies
against proteins in closely related species because of the similarity of the foreign proteins to
the native, tolerated protein (Nossal, G. J. V., "Immunologic Tolerance" In Fundamental
Immunology, Second Edition, 571-586 (William E. Paul, editor, Raven Press Ltd., New
York, 1989)). We expect that because of the functionally conservative nature of family
shuffling, breeding closely related gene homologues, rather than performing random or site
directed mutagenesis, is more likely to generate immunologically conservative chimeras.
Undesired T and B cell epitopes can be removed from shuffled clones by back-crossing
evolved IFN-as with wild type IFN-as and screening for genes which retain high activity,
but lose immunogenic epitopes.
Classical inbreeding to enhance a particular phenotype can result in loss of
characteristics in the parentals that are not under selective pressure (Lynch and Walsh,
Genetics and Analysis of Quantitative Traits (Sinauer Associates Inc., Sunderland, Mass.,
1998)). In this study, we selectively bred for activity in murine cells, with no pressure for
retention of activity on the Hu-IFN-a receptor which is only 49% identical in amino acid
sequence (Uze et al. (1995) J. Interferon Cytokine Res. 15(1):3-26). It was therefore of
interest to test whether anti-proliferative activity on human cells was retained by the four
most active shuffled IFN-as that were bred for high activity in mouse cells. Surprisingly, all
of these clones retained anti-proliferative activity in human cells that is within 2-fold of the
activity of Hu-IFN-a2a (3x10^7 Units/mg; see Experimental protocols), whereas none of the
Mu-IFN-as had detectable activity in human cells (less than 10" of the activity of
Hu-IFN-a). This illustrates how family shuffling, by using recombination of functionally
conservative natural sequence diversity within a gene family rather than random point
mutation, can allow one to evolve cytokines which retain activity on one receptor while
gaining activity on a homologous receptor. The ability to evolve pluripotent cytokines may
be useful in the development of novel protein therapeutics, such as for proteins active in
multiple plants, farm animals, or pathogens.

Previous engineering of cytokines has relied principally on site-directed
mutagenesis guided by structural models (Fuh et al. (1992) Science 256:1677-80) and on
cassette mutagenesis or random mutagenesis (Lowman and Wells (1993) J. Mol. Biol.
234(3):564-78; Thomas et al. (1995) Proc. Nat'l. Acad. Sci. USA 92(9):3779-83).
Improving genes by classical structure/function analysis generally relies on measuring the
effect of single mutations or cassettes of mutations in one context, and then multi-step
mutants are built up based on the assumption of additivity of combinations of these mutants
(Fuh et al., supra; Lowman and Wells, supra; Thomas et al., supra). Consequently,
combinations of mutations that have non-additive effects are difficult to discover by these
methods (Wells, J. A. (1990) Biochemistry 29(37):8509-17). Several studies have identified
residues in chimeric and point mutated Hu-IFN-as that confer activity in murine cells.
Replacing residues 61 to 92 of Hu-IFN-a8 with those from Hu-IFN-a1 significantly
increases the activity in murine cells, and point mutagenesis implicates residues 84, 86, 87
and 90 as contributing to this effect (Horisberger and Di Marco (1995) Pharmacol. Ther.
66(3):507-34). Analysis of a series of 20 chimeras between Hu-IFN-a1 and Hu-IFN-a2a
reveals that sequences in the C-terminal 49 residues are responsible for its unusually high
activity in murine cells (Weber et al. (1987) EMBO J. 6(3):591-8). Further analysis by
site-directed mutagenesis reveals that transfer of residues K121 or R125 to Hu-IFN-a2 increases
activity on murine cells, and that together they increase activity by 400-fold (Id.). Based on
this functional data and on homology modeling, the residues in these two regions (78-95 and
121-132) have been proposed to interact with the Mu-IFN-a receptor (Fish, E. N. (1992) J.
Interferon Res. 12(4):257-66; Uze et al. (1994) J. Mol. Biol. 243(2):245-57).

K121 and R125, the two residues from Hu-IFN-a1 which have been shown to
confer activity in mouse cells when transplanted onto other Hu-IFN-as, occur either
separately or together in all of our cycle 1 chimeras; and both residues occur together in all
five of the most active chimeras from cycle two (Figure 1B). While the three most active
chimeras are identical to Hu-IFN-a1 at five of the six residues that have previously been
shown to contribute to its activity in mouse cells (Horisberger and Di Marco, supra; Weber
et al., supra), they contain 22-28 additional sequence changes relative to Hu-IFN-a1
(Figure 1). This large number of differences from the parental genes is typical of family
shuffling because blocks of sequence are shuffled in molecular breeding, and thus progeny
sequences generally have many amino acid differences from the closest parental molecules.
An important consequence of this feature of family shuffling is that complex improvements
do not need to be built up in multiple rounds of mutation or by using powerful selection
methods on large libraries. These clones are improved by up to 285,000-fold relative to
Hu-IFN-a2a, an additional 500-fold increase in activity relative to the K121, R125 double
mutant (Weber et al., supra). The three most active chimeras in this report are more active
in murine cells than any chimeras or point mutants reported in any previous studies
(Horisberger and Di Marco, supra; Weber et al., supra), and are the first examples of
Hu-IFN-a variants that are more active than the native Mu-IFN-as. This study illustrates the
utility and novel aspects of DNA shuffling for recruiting, from gene families, segments of
genes that confer or enhance a novel biological activity, and for sequentially optimizing
them by molecular breeding, without a priori guidance from structural or functional
information.

In summary, molecular breeding of IFN-a genes from one species and a
modest number of cell-based assays allowed us to rapidly obtain recombinants with potent
IFN-a activity on a distantly related species. This suggests that diverse mammalian
homologues of human cytokines can be used as breeding stock from which to evolve
cytokines that are more active than, or have selectivity profiles superior to, the native
cytokines. For example, it may be possible to evolve Hu-IFN-as with reduced side effects
(Dusheiko, G. (1997) Hepatology 26(3 Suppl 1):112S-121S; Vial and Descotes (1994) Drug
Experience 10(2):115-150; Funke et al. (1995) Ann. Hematol. 68(1):49-52; Schomberg et
al. (1993) J. Cancer Res. Clin. Oncol. 119(12):745-55), improved anti-tumor activity in
humans (Gutterman et al. (1994) Proc. Nat'l. Acad. Sci. USA 91(4):1198-205), or IL-2
variants with reduced toxicity (Dusheiko, supra).

Using molecular breeding, one can dramatically accelerate the rate of
out-crossing or back-crossing genes, and one can focus on a single gene, allowing one to
improve traits much more rapidly than is possible with classical breeding. Molecular
breeding also allows one to generalize the principles of classical breeding by simultaneously
breeding large gene families and by breeding genes from different species. This technology,
therefore, unites the precision, rapidity and scalability of molecular techniques with the
principles of classical breeding. While it has required many generations of classical
selective breeding of wild strains to optimize commercial plant and animal varieties, only a
few cycles of in vitro selective molecular breeding are required to optimize existing gene
families for new phenotypes (Stemmer, supra; Patten et al., supra; Crameri et al., supra).
The high quality of the libraries makes it practical to identify improved clones by screening
in complex, time-consuming or expensive biological assays. This provides a more effective
route to discovering desired activities than genomics-based approaches to searching for
potent activities of interest in existing genomes. Molecular breeding technology greatly
enhances our ability to utilize the wealth of sophisticated genetic diversity accumulated
during billions of years of biological evolution.

Example 2 

EVOLUTION OF A LIGAND FOR AN ORPHAN CHEMOKINE RECEPTOR 

This Example describes a procedure by which one can obtain a ligand for an
orphan receptor. The procedure is useful when, for example, one has identified a gene that
exhibits homology to a known member of a known receptor family, but no ligand is known
that has high activity on the putative receptor that is encoded by the gene. For purposes of
illustration, the evolution of a ligand for an orphan receptor that resembles the CCR5
chemokine receptor is described in this Example. It will be appreciated by those of skill in
the art that one could readily adapt this protocol for use to obtain ligands for other orphan
receptors.

A gene is identified that encodes a receptor that exhibits homology to the
CCR5 receptor. No ligand is known that strongly modulates the receptor encoded by the
gene, and either weak crossreactivity or no measurable activity on the receptor is exhibited
by a natural ligand of CCR5 (e.g., RANTES (regulated upon activation, normal T-cell
expressed and secreted)). It is desired to obtain a ligand that has high activity on this orphan
receptor.

DNA Shuffling of Natural Ligands for CCR5

One or more natural ligands for the CCR5 receptor are used as the starting
point for DNA shuffling. Nucleic acids that encode human RANTES, for example, are
fragmented and subjected to shuffling with nucleic acids that encode other CCR5 ligands. In
one embodiment, family shuffling is employed in which the human RANTES-encoding
nucleic acids are shuffled with nucleic acids that encode all or part of human homologs of
RANTES, such as MIP-1 alpha (macrophage inflammatory protein-1 alpha) and MIP-1 beta.
Alternatively, or additionally, nucleic acids that encode human RANTES are shuffled with
RANTES homologs from other mammals.
Screening for Activity on Orphan Receptor

The shuffled nucleic acids are then expressed and the resulting shuffled
ligands are tested for activity on the orphan receptor. Conveniently, a reporter cell line is
constructed in which a reporter gene, such as a luciferase gene, is placed under the control of
a response element for the orphan receptor. In some embodiments, the ligand binding
domain of the orphan receptor is attached to a DNA binding domain of a receptor for which
a response element is known (e.g., a GAL4 receptor), and the reporter gene is linked to the
corresponding response element (e.g., a GAL4 UAS).

Shuffled ligands that activate or repress the receptor activity are selected for
further analysis and/or additional shuffling. By repeating the shuffling one or more times and
after each cycle selecting for the desired activity, one can obtain a shuffled ligand that has a
high degree of the desired activity.
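
The repeated cycle of shuffling, screening and selection described above can be illustrated with a toy simulation. The Python sketch below is a conceptual stand-in only: the function names (shuffle_family, evolve, toy_activity), the block-wise recombination of pre-aligned strings, and the scoring function that plays the role of a reporter-assay readout are all hypothetical and do not model the wet-lab chemistry or any real ligand.

```python
import random

def shuffle_family(parents, n_children, block=5, rng=None):
    """Build chimeras by copying successive blocks from randomly chosen parents.

    Parents are equal-length, pre-aligned strings (a toy stand-in for
    fragmentation and reassembly of homologous genes).
    """
    rng = rng or random.Random()
    length = len(parents[0])
    children = []
    for _ in range(n_children):
        pieces = [rng.choice(parents)[i:i + block] for i in range(0, length, block)]
        children.append("".join(pieces))
    return children

def evolve(parents, score, cycles=3, library_size=200, keep=4, seed=0):
    """Repeat shuffle -> screen -> select, carrying the best clones forward."""
    rng = random.Random(seed)
    pool = list(parents)
    best_per_cycle = []
    for _ in range(cycles):
        library = shuffle_family(pool, library_size, rng=rng)
        # Screen the new library together with the current parents.
        candidates = sorted(set(library) | set(pool))
        ranked = sorted(candidates, key=score, reverse=True)
        pool = ranked[:keep]                  # parents for the next cycle
        best_per_cycle.append(score(pool[0]))
    return pool, best_per_cycle

# Hypothetical "assay": number of positions matching an arbitrary ideal string.
IDEAL = "AAAAABBBBBCCCCCDDDDD"

def toy_activity(seq):
    return sum(a == b for a, b in zip(seq, IDEAL))

# Four hypothetical parents, each carrying one favorable block.
parents = [
    "AAAAAXXXXXXXXXXXXXXX",
    "XXXXXBBBBBXXXXXXXXXX",
    "XXXXXXXXXXCCCCCXXXXX",
    "XXXXXXXXXXXXXXXDDDDD",
]
selected, best_per_cycle = evolve(parents, toy_activity)
print(best_per_cycle)
```

Because the current parents are screened alongside each new library, the best score in this sketch cannot decrease from one cycle to the next. That is a design choice of the toy model rather than a property claimed for the method.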

Use of Shuffled Ligand 

Shuffled ligands for the orphan receptor are useful for several purposes. For
example, the evolved ligands are useful for studies of the pathways that are mediated by the
receptors. The ligands can also be used in assays to screen for antagonists of receptor
activation (e.g., an evolved ligand that activates an orphan receptor and results in expression
of luciferase can be used in a screening assay to identify a molecule that inhibits the
activation of the receptor).

It is understood that the examples and embodiments described herein are for
illustrative purposes only and that various modifications or changes in light thereof will be
suggested to persons skilled in the art and are to be included within the spirit and purview of
this application and scope of the appended claims. All publications, patents, and patent
applications cited herein are hereby incorporated by reference for all purposes.
WHAT IS CLAIMED IS:



1. A method for obtaining a surrogate ligand for an orphan receptor, the
method comprising:
(1) creating a library of recombinant polynucleotides; and
(2) screening the library to identify a recombinant polynucleotide that
encodes a surrogate ligand that can specifically bind to a ligand binding domain of the
orphan receptor.

2. The method of claim 1, wherein the library is obtained by recombining
at least first and second forms of a nucleic acid, each of which forms encodes a ligand for a
member of a receptor family, or a fragment of said ligand, wherein the first and second
forms differ from each other in two or more nucleotides, to produce a library of recombinant
nucleic acids.

3. The method of claim 2, wherein the method further comprises:
(3) recombining at least one recombinant polynucleotide that encodes a
surrogate ligand that can specifically bind to a ligand binding domain of the orphan receptor
with a further form of the nucleic acid, which is the same or different from the first and
second forms, to produce a further library of recombinant polynucleotides;
(4) screening the further library to identify at least one further
optimized recombinant polynucleotide that encodes a surrogate ligand that can specifically
bind to a ligand binding domain of the orphan receptor; and
(5) repeating (3) and (4), as necessary, until the surrogate ligand
encoded by the further optimized recombinant polynucleotide exhibits an enhanced ability to
specifically bind to the ligand binding domain of the orphan receptor.



4. The method of claim 2, wherein the orphan receptor exhibits homology
to at least one member of the receptor family.

5. The method of claim 4, wherein the homology is evidenced by an amino
acid sequence of one or more domains of the orphan receptor being at least 60% identical to
the amino acid sequence of a corresponding domain of at least one member of the receptor
family.

6. The method of claim 5, wherein the amino acid sequence of one or more
domains of the orphan receptor is at least 70% identical to the amino acid sequence of a
corresponding domain of at least one member of the receptor family.

7. The method of claim 4, wherein the homology is evidenced by a
primary sequence motif of a receptor family being present in the orphan receptor.

8. The method of claim 4, wherein the homology is evidenced by a
structural motif of a receptor family being present in the orphan receptor.

9. The method of claim 1, wherein the surrogate ligand exhibits an agonist
function upon binding to the ligand binding domain of the orphan receptor.

10. The method of claim 9, wherein the screening comprises expressing the
library of recombinant polynucleotides, and contacting the resulting library of candidate
surrogate ligands with a test cell that comprises a fusion polypeptide which comprises: a) an
extracellular domain of the orphan receptor; and b) a cytoplasmic domain of a second
receptor, whereby the binding of a ligand to the extracellular domain results in a detectable
effect on the test cells.

11. The method of claim 10, wherein the second receptor is a cytokine
receptor.

12. The method of claim 11, wherein the second receptor is selected from
the group consisting of an interleukin receptor, an interferon receptor, a chemokine receptor,
a hematopoietic growth factor receptor, a tumor necrosis factor receptor, and a transforming
growth factor.

13. The method of claim 10, wherein the second receptor is a human
receptor.

14. The method of claim 10, wherein the detectable effect is induction or
inhibition of proliferation of the test cell.

1 15. The method of claim 9, wherein the screening comprises: 

2 expressing the library of recombinant polynucleotides to obtain a library 

3 of candidate surrogate ligands; 

4 contacting the candidate surrogate ligands with a test cell that 

5 comprises: 

6 a) a fusion polypeptide comprising: 1) a ligand binding domain of 

7 the orphan receptor; and 2) a DNA binding domain of a second 

8 receptor; and 

9 b) a reporter gene construct which comprises a response element to 

10 which the DNA binding domain can bind, wherein the response 

11 element is operably linked to a promoter that is operative in the 

12 cell and the promoter is operably linked to a reporter gene; and 

13 determining whether the reporter gene is expressed at a higher or lower 

14 level in the presence of a candidate surrogate ligand compared to expression in the absence 

15 of the candidate surrogate ligand. 

1 16. The method of claim 15, wherein the test cells are contacted with a 

2 standard amount of each candidate surrogate ligand. 

1 17. The method of claim 15, wherein the DNA binding domain is a Gal4 

2 DNA binding domain. 

1 18. The method of claim 15, wherein the second receptor is selected from 

2 the group consisting of an estrogen receptor, a progesterone receptor, a glucocorticoid 

3 receptor, an androgen receptor, a mineralocorticoid receptor, a vitamin D receptor, a retinoid 

4 receptor, and a thyroid hormone receptor. 

1 19. The method of claim 1, wherein the library is subdivided into a plurality 

2 of pools, each of which pools is screened to identify one or more positive pools that include 

3 a recombinant polynucleotide that encodes a surrogate ligand that can specifically bind to a 

4 ligand binding domain of the orphan receptor. 

1 20. The method of claim 19, wherein the recombinant polynucleotides in a 

2 positive pool are subjected to further recombination and screening. 

1 21. The method of claim 19, wherein the recombinant polynucleotides in a 



2 positive pool are further subdivided into a plurality of subpools, each of which subpools is 

3 screened to identify one or more positive subpools that include a recombinant polynucleotide 

4 that encodes a surrogate ligand that can specifically bind to a ligand binding domain of the 

5 orphan receptor. 
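
The pool and subpool screening recited in claims 19 and 21 can be sketched as a recursive subdivision. This is a minimal illustration under stated assumptions, not the claimed method itself: `pool_is_positive` stands in for whatever binding or functional assay is used and is assumed to give no false positives or negatives, the clone identifiers and the pool count are hypothetical, and the further recombination step of claim 20 is not modeled.

```python
from typing import Callable, List, Sequence


def split_into_pools(clones: Sequence[str], n_pools: int) -> List[List[str]]:
    """Distribute clones round-robin into at most n_pools non-empty pools."""
    if not clones:
        return []
    n = min(n_pools, len(clones))
    return [list(clones[i::n]) for i in range(n)]


def find_positive_clones(clones: Sequence[str],
                         pool_is_positive: Callable[[Sequence[str]], bool],
                         n_pools: int = 4) -> List[str]:
    """Screen pools, then subdivide each positive pool until single clones remain."""
    if n_pools < 2:
        raise ValueError("n_pools must be at least 2 so that pools shrink")
    hits: List[str] = []
    for pool in split_into_pools(clones, n_pools):
        if not pool_is_positive(pool):
            continue  # negative pools are discarded without further work
        if len(pool) == 1:
            hits.extend(pool)  # a positive pool of one clone is a hit
        else:
            hits.extend(find_positive_clones(pool, pool_is_positive, n_pools))
    return hits


# Hypothetical library of 20 clones, two of which encode binders.
library = [f"clone{i}" for i in range(20)]
binders = {"clone3", "clone17"}


def assay(pool: Sequence[str]) -> bool:
    """Stand-in assay: a pool scores positive if it contains any binder."""
    return any(clone in binders for clone in pool)


print(sorted(find_positive_clones(library, assay)))
```

Subdividing only the positive pools keeps the number of assays well below one per clone when binders are rare, which is the practical motivation for screening in pools.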



1 22. A method of identifying a compound that modulates activity of an 

2 orphan receptor, the method comprising: 

3 obtaining a surrogate ligand for the orphan receptor by: 

4 (1) creating a library of recombinant polynucleotides; and 

5 (2) screening the library to identify a recombinant polynucleotide 

6 that encodes a surrogate ligand that can specifically bind to a 

7 ligand binding domain of the orphan receptor; 

8 contacting the surrogate ligand with a polypeptide that comprises the 

9 ligand binding domain of the orphan receptor in the presence of a potential modulator 
10 compound; and 

11 determining whether the activity of the polypeptide is increased or 

12 decreased compared to the activity of the polypeptide in the absence of the potential 

13 modulator compound. 

1 23. The method of claim 22, wherein the polypeptide is a fusion 

2 polypeptide that comprises: a) a ligand binding domain of the orphan receptor; and b) a 

3 cytoplasmic domain of a second receptor, whereby the binding of a ligand to the 

4 extracellular domain results in a detectable effect on the test cells. 

1 24. The method of claim 23, wherein the second receptor is selected from 

2 the group consisting of an estrogen receptor, a progesterone receptor, a glucocorticoid 

3 receptor, an androgen receptor, a mineralocorticoid receptor, a vitamin D receptor, a retinoid 

4 receptor, and a thyroid hormone receptor. 

1 25. The method of claim 22, wherein the polypeptide is a fusion 

2 polypeptide that comprises: a) a ligand binding domain of the orphan receptor; and b) a 

3 DNA binding domain of a second receptor; 

4 and the activity of the fusion polypeptide is determined by contacting 

5 the polypeptide with a reporter gene construct which comprises a response element to which 

6 the DNA binding domain can bind, wherein the response element is operably linked to a 

7 promoter that is operative in the cell and the promoter is operably linked to a reporter gene; 

8 and 

9 determining whether the reporter gene is expressed at a higher or lower 

10 level in the presence of a potential modulator compound compared to the expression level in 

11 the absence of the potential modulator compound. 

1 26. The method of claim 25, wherein the second receptor is a GAL4 

2 receptor. 